Abstract

The theory of graph games with ω-regular winning conditions is the foundation for modeling
and synthesizing reactive processes. In the case of stochastic reactive processes, the corresponding
stochastic graph games have three players, two of them (System and Environment)
behaving adversarially, and the third (Uncertainty) behaving probabilistically. We consider two
problems for stochastic graph games: the qualitative problem asks for the set of states from
which a player can win with probability 1 (almost-sure winning); and the quantitative problem
asks for the maximal probability of winning (optimal winning) from each state. We consider
ω-regular winning conditions formalized as Muller winning conditions. We present optimal
memory  bounds  for  pure  (deterministic)  almost-sure  winning  and  optimal  winning  strategies 
in  stochastic  graph  games  with  Muller  winning  conditions.  We  also  present  improved  memory 
bounds  for  randomized  almost-sure  winning  and  optimal  strategies.  We  study  the  complexity  of 
stochastic  Muller  games  and  show  that  the  quantitative  analysis  problem  is  PSPACE-complete. 
Our  results  are  relevant  in  synthesis  of  stochastic  reactive  processes. 
1 Introduction


A stochastic graph game [6] is played on a directed graph with three kinds of states: player-1,
player-2, and probabilistic states. At player-1 states, player 1 chooses a successor state; at player-2
states, player 2 chooses a successor state; and at probabilistic states, a successor state is chosen
according to a given probability distribution. The result of playing the game forever is an infinite
path through the graph. If there are no probabilistic states, we refer to the game as a 2-player
graph game; otherwise, as a 2½-player graph game. There has been a long history of using 2-player
graph games for modeling and synthesizing reactive processes [1, 21, 23]: a reactive system and its
environment represent the two players, whose states and transitions are specified by the states and
edges of a game graph. Consequently, 2½-player graph games provide the theoretical foundation
for modeling and synthesizing processes that are both reactive and stochastic [13, 22].

For the modeling and synthesis (or “control”) of reactive processes, one traditionally considers
ω-regular winning conditions, which naturally express the temporal specifications and fairness
assumptions of transition systems [17]. This paper focuses on 2½-player graph games with respect to
an important normal form of ω-regular winning conditions, namely Muller winning conditions [24].

In the case of 2-player graph games, where no randomization is involved, a fundamental
determinacy result of Gurevich and Harrington [14], based on the LAR (latest appearance record)
construction, ensures that, given an ω-regular winning condition, at each state, either player 1 has
a strategy to ensure that the condition holds, or player 2 has a strategy to ensure that the condition
does not hold. Thus, the problem of solving 2-player graph games consists in finding the set of winning
states, from which player 1 can ensure that the condition holds. Along with the computation of
the winning states, the characterization of the complexity of winning strategies is a central question,
since the winning strategies represent the implementation of the controller in the synthesis
problem. The elegant algorithm of Zielonka [25] uses the LAR construction to compute winning sets
in 2-player graph games with Muller conditions. In [10] the authors present an insightful analysis
of Zielonka's algorithm to obtain optimal memory bounds (matching upper and lower bounds) for
winning strategies in 2-player graph games with Muller conditions.

In the case of 2½-player graph games, where randomization is present in the transition
structure, the notion of winning needs to be clarified. Player 1 is said to win surely if she has a strategy
that guarantees to achieve the winning condition against all player-2 strategies. While this is the
classical notion of winning in the 2-player case, it is less meaningful in the presence of probabilistic
states, because it makes all probabilistic choices adversarial (it treats them analogously to player-2
choices). To adequately treat probabilistic choice, we consider the probability with which player 1
can ensure that the winning condition is met. We thus define two solution problems for 2½-player
graph games: the qualitative problem asks for the set of states from which player 1 can ensure
winning with probability 1; the quantitative problem asks for the maximal probability with which
player 1 can ensure winning from each state (this probability is called the value of the game at
a state). Correspondingly, we define almost-sure winning strategies, which enable player 1 to win
with probability 1 whenever possible, and optimal strategies, which enable player 1 to win with
maximal probability. The main result of this paper is an optimal memory bound for pure
(deterministic) almost-sure and optimal strategies in 2½-player graph games with Muller conditions. In
fact, we generalize the elegant analysis of [10] to present an upper bound for optimal strategies for
2½-player graph games with Muller conditions that matches the lower bound for sure winning in
2-player games. As a consequence, we generalize several results known for 2½-player graph games,
such as the existence of pure memoryless optimal strategies for parity conditions [5, 26, 19] and Rabin
conditions [4]. We present the result for almost-sure strategies in Section 3, and then generalize it
to optimal strategies in Section 4. The results developed also help us to precisely characterize the
complexity of several classes of 2½-player Muller games. We show that the complexity of quantitative
analysis of 2½-player games with Muller objectives is PSPACE-complete. We also show that
for two special classes of Muller objectives (namely, union-closed and upward-closed objectives) the
problem is coNP-complete. We also study the memory bounds for randomized strategies. In the case
of randomized strategies we improve the upper bound for almost-sure and optimal strategies as
compared to pure strategies (Section 5). The problem of a matching upper and lower bound for
almost-sure and optimal randomized strategies remains open.

2  Definitions 

We consider several classes of turn-based games, namely, two-player turn-based probabilistic games
(2½-player games), two-player turn-based deterministic games (2-player games), and Markov
decision processes (1½-player games).

Notation. For a finite set $A$, a probability distribution on $A$ is a function $\delta : A \to [0, 1]$ such that
$\sum_{a \in A} \delta(a) = 1$. We denote the set of probability distributions on $A$ by $\mathcal{D}(A)$. Given a distribution
$\delta \in \mathcal{D}(A)$, we denote by $\mathrm{Supp}(\delta) = \{ x \in A \mid \delta(x) > 0 \}$ the support of $\delta$.
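The notation above can be made concrete with a minimal Python sketch. The dictionary representation and the helper names `is_distribution` and `supp` are illustrative assumptions, not part of the paper; exact rationals are used so that the sum-to-one check is not affected by rounding.

```python
from fractions import Fraction

def is_distribution(delta, A):
    """True if delta assigns values in [0, 1] to elements of A that sum to 1."""
    return (set(delta) <= set(A)
            and all(0 <= p <= 1 for p in delta.values())
            and sum(delta.values()) == 1)

def supp(delta):
    """Support of delta: the elements with strictly positive probability."""
    return {x for x, p in delta.items() if p > 0}

# Hypothetical example over A = {a, b, c}.
A = {"a", "b", "c"}
delta = {"a": Fraction(1, 3), "b": Fraction(2, 3), "c": Fraction(0)}
```

Here `supp(delta)` contains `a` and `b` only, since `c` has probability 0.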

Game graphs. A turn-based probabilistic game graph (2½-player game graph) $G =
((S, E), (S_1, S_2, S_\bigcirc), \delta)$ consists of a directed graph $(S, E)$, a partition $(S_1, S_2, S_\bigcirc)$ of the finite
set $S$ of states, and a probabilistic transition function $\delta : S_\bigcirc \to \mathcal{D}(S)$, where $\mathcal{D}(S)$ denotes the set
of probability distributions over the state space $S$. The states in $S_1$ are the player-1 states, where
player 1 decides the successor state; the states in $S_2$ are the player-2 states, where player 2 decides
the successor state; and the states in $S_\bigcirc$ are the probabilistic states, where the successor state is
chosen according to the probabilistic transition function $\delta$. We assume that for $s \in S_\bigcirc$ and $t \in S$,
we have $(s, t) \in E$ iff $\delta(s)(t) > 0$, and we often write $\delta(s, t)$ for $\delta(s)(t)$. For technical convenience
we assume that every state in the graph $(S, E)$ has at least one outgoing edge. For a state $s \in S$,
we write $E(s)$ to denote the set $\{ t \in S \mid (s, t) \in E \}$ of possible successors. The size of a game
graph $G = ((S, E), (S_1, S_2, S_\bigcirc), \delta)$ is

$$|G| = |S| + |E| + \sum_{t \in S} \sum_{s \in S_\bigcirc} |\delta(s)(t)|,$$

where $|\delta(s)(t)|$ denotes the space to represent the transition probability $\delta(s)(t)$ in binary.
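As an illustration of the definition, the following Python sketch represents a game graph and computes its size. The class name `GameGraph`, the field names, and the helper `bits` are hypothetical. In particular, counting the bits of numerator and denominator of each positive transition probability is only one possible reading of "space to represent the probability in binary", and is an assumption of this sketch.

```python
from dataclasses import dataclass
from fractions import Fraction

@dataclass
class GameGraph:
    s1: frozenset      # player-1 states
    s2: frozenset      # player-2 states
    sp: frozenset      # probabilistic states
    edges: frozenset   # set of pairs (s, t)
    delta: dict        # probabilistic state -> {successor: Fraction}

    @property
    def states(self):
        return self.s1 | self.s2 | self.sp

    def succ(self, s):
        """E(s): the set of possible successors of s."""
        return {t for (u, t) in self.edges if u == s}

    def is_well_formed(self):
        S = self.states
        partition = len(S) == len(self.s1) + len(self.s2) + len(self.sp)
        no_dead_end = all(self.succ(s) for s in S)
        # (s, t) is an edge iff delta(s)(t) > 0, and delta(s) sums to 1.
        consistent = all(
            sum(self.delta[s].values()) == 1
            and {t for t, p in self.delta[s].items() if p > 0} == self.succ(s)
            for s in self.sp)
        return partition and no_dead_end and consistent

def bits(p):
    """Assumed encoding cost of a rational: bits of numerator plus denominator."""
    return p.numerator.bit_length() + p.denominator.bit_length()

def size(g):
    prob_bits = sum(bits(p) for s in g.sp for p in g.delta[s].values() if p > 0)
    return len(g.states) + len(g.edges) + prob_bits

# Hypothetical three-state example.
g = GameGraph(
    s1=frozenset({"a"}), s2=frozenset({"b"}), sp=frozenset({"r"}),
    edges=frozenset({("a", "r"), ("b", "a"), ("r", "a"), ("r", "b")}),
    delta={"r": {"a": Fraction(1, 2), "b": Fraction(1, 2)}})
```

Under the assumed encoding, the example has 3 states, 4 edges, and 3 bits for each of the two probabilities 1/2.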

A set $U \subseteq S$ of states is called $\delta$-closed if for every probabilistic state $u \in U \cap S_\bigcirc$, if $(u, t) \in E$,
then $t \in U$. The set $U$ is called $\delta$-live if for every nonprobabilistic state $s \in U \cap (S_1 \cup S_2)$, there
is a state $t \in U$ such that $(s, t) \in E$. A $\delta$-closed and $\delta$-live subset $U$ of $S$ induces a subgame graph
of $G$, indicated by $G \upharpoonright U$.
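The two closure conditions translate directly into set checks. The sketch below is a minimal illustration; the function names and the small example graph are assumptions made for this example.

```python
def succ(edges, s):
    """E(s): successors of s in the edge set."""
    return {t for (u, t) in edges if u == s}

def is_delta_closed(U, sp, edges):
    """Every edge leaving a probabilistic state of U stays inside U."""
    return all(succ(edges, u) <= U for u in U & sp)

def is_delta_live(U, s1, s2, edges):
    """Every nonprobabilistic state of U has some successor inside U."""
    return all(succ(edges, s) & U for s in U & (s1 | s2))

def induces_subgame(U, s1, s2, sp, edges):
    return is_delta_closed(U, sp, edges) and is_delta_live(U, s1, s2, edges)

# Hypothetical example: a is a player-1 state, b a player-2 state,
# r a probabilistic state.
s1, s2, sp = {"a"}, {"b"}, {"r"}
edges = {("a", "a"), ("a", "r"), ("b", "a"), ("r", "a"), ("r", "b")}
```

In this example `{"a"}` induces a subgame, `{"a", "r"}` does not (the probabilistic state `r` can leave the set towards `b`), and `{"b"}` does not (`b` has no successor inside the set).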

The turn-based deterministic game graphs (2-player game graphs) are the special case of the
2½-player game graphs with $S_\bigcirc = \emptyset$. The Markov decision processes (1½-player game graphs)
are the special case of the 2½-player game graphs with $S_1 = \emptyset$ or $S_2 = \emptyset$. We refer to the MDPs
with $S_2 = \emptyset$ as player-1 MDPs, and to the MDPs with $S_1 = \emptyset$ as player-2 MDPs.

Plays and strategies. An infinite path, or play, of the game graph $G$ is an infinite sequence
$\omega = \langle s_0, s_1, s_2, \ldots \rangle$ of states such that $(s_k, s_{k+1}) \in E$ for all $k \in \mathbb{N}$. We write $\Omega$ for the set of all
plays, and for a state $s \in S$, we write $\Omega_s \subseteq \Omega$ for the set of plays that start from the state $s$.

A strategy for player 1 is a function $\sigma : S^* \cdot S_1 \to \mathcal{D}(S)$ that assigns a probability distribution to
all finite sequences $w \in S^* \cdot S_1$ of states ending in a player-1 state (the sequence represents a prefix
of a play). Player 1 follows the strategy $\sigma$ if in each player-1 move, given that the current history
of the game is $w \in S^* \cdot S_1$, she chooses the next state according to the probability distribution
$\sigma(w)$. A strategy must prescribe only available moves, i.e., for all $w \in S^*$ and $s \in S_1$ we have
$\mathrm{Supp}(\sigma(w \cdot s)) \subseteq E(s)$. The strategies for player 2 are defined analogously. We denote by $\Sigma$ and $\Pi$
the set of all strategies for player 1 and player 2, respectively.

Once a starting state $s \in S$ and strategies $\sigma \in \Sigma$ and $\pi \in \Pi$ for the two players are fixed,
the outcome of the game is a random walk $\omega_s^{\sigma,\pi}$ for which the probabilities of events are uniquely
defined, where an event $\mathcal{A} \subseteq \Omega$ is a measurable set of paths. Given strategies $\sigma$ for player 1 and $\pi$
for player 2, a play $\omega = \langle s_0, s_1, s_2, \ldots \rangle$ is feasible if for every $k \in \mathbb{N}$ the following three conditions
hold: (1) if $s_k \in S_\bigcirc$, then $(s_k, s_{k+1}) \in E$; (2) if $s_k \in S_1$, then $\sigma(s_0, s_1, \ldots, s_k)(s_{k+1}) > 0$; and
(3) if $s_k \in S_2$, then $\pi(s_0, s_1, \ldots, s_k)(s_{k+1}) > 0$. Given two strategies $\sigma \in \Sigma$ and $\pi \in \Pi$, and a
state $s \in S$, we denote by $\mathrm{Outcome}(s, \sigma, \pi) \subseteq \Omega_s$ the set of feasible plays that start from $s$ given
strategies $\sigma$ and $\pi$. For a state $s \in S$ and an event $\mathcal{A} \subseteq \Omega$, we write $\mathrm{Pr}_s^{\sigma,\pi}(\mathcal{A})$ for the probability
that a path belongs to $\mathcal{A}$ if the game starts from the state $s$ and the players follow the strategies
$\sigma$ and $\pi$, respectively. In the context of player-1 MDPs we often omit the argument $\pi$, because $\Pi$
is a singleton set.

We classify strategies according to their use of randomization and memory. The strategies that
do not use randomization are called pure. A player-1 strategy $\sigma$ is pure if for all $w \in S^*$ and $s \in S_1$,
there is a state $t \in S$ such that $\sigma(w \cdot s)(t) = 1$. We denote by $\Sigma^P \subseteq \Sigma$ the set of pure strategies
for player 1. A strategy that is not necessarily pure is called randomized. Let $M$ be a set called
memory, that is, $M$ is a set of memory elements. A player-1 strategy $\sigma$ can be described as a pair
of functions $\sigma = (\sigma_u, \sigma_m)$: a memory-update function $\sigma_u : S \times M \to M$ and a next-move function
$\sigma_m : S_1 \times M \to \mathcal{D}(S)$. We can think of strategies with memory as input/output automata computing
the strategies (see [10] for details). The strategy $(\sigma_u, \sigma_m)$ is finite-memory if the memory $M$ is
finite, and then we denote the size of the memory of the strategy $\sigma$ by the size of its memory $M$,
i.e., $|M|$. We denote by $\Sigma^F$ the set of finite-memory strategies for player 1, and by $\Sigma^{PF}$ the set of
pure finite-memory strategies; that is, $\Sigma^{PF} = \Sigma^P \cap \Sigma^F$. The strategy $(\sigma_u, \sigma_m)$ is memoryless if
$|M| = 1$; that is, the next move does not depend on the history of the play but only on the current
state. A memoryless player-1 strategy can be represented as a function $\sigma : S_1 \to \mathcal{D}(S)$. A pure
memoryless strategy is a pure strategy that is memoryless. A pure memoryless strategy for player 1
can be represented as a function $\sigma : S_1 \to S$. We denote by $\Sigma^M$ the set of memoryless strategies for
player 1, and by $\Sigma^{PM}$ the set of pure memoryless strategies; that is, $\Sigma^{PM} = \Sigma^P \cap \Sigma^M$. Analogously
we define the corresponding strategy families $\Pi^P$, $\Pi^F$, $\Pi^{PF}$, $\Pi^M$, and $\Pi^{PM}$ for player 2.
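A strategy with memory can be sketched as a small transducer. The class `FiniteMemoryStrategy` below is a hypothetical illustration. The text does not fix an initial memory element or the exact order in which memory is updated, so the sketch assumes an explicit initial element and updates the memory on every state of the history except the last one before asking for the next move.

```python
from dataclasses import dataclass
from typing import Callable, Hashable

@dataclass
class FiniteMemoryStrategy:
    memory: frozenset     # the finite memory set M
    initial: Hashable     # assumed initial memory element
    update: Callable      # memory-update function: (state, m) -> m
    next_move: Callable   # next-move function: (player-1 state, m) -> {state: prob}

    def __call__(self, history):
        """Distribution over successors after a history ending in a player-1 state."""
        m = self.initial
        for s in history[:-1]:
            m = self.update(s, m)
        return self.next_move(history[-1], m)

    def size(self):
        return len(self.memory)

# Hypothetical pure strategy with |M| = 2: at the player-1 state "a",
# alternate between the successors "b" and "c" on successive visits.
alternate = FiniteMemoryStrategy(
    memory=frozenset({0, 1}),
    initial=0,
    update=lambda s, m: 1 - m if s == "a" else m,
    next_move=lambda s, m: {"b": 1} if m == 0 else {"c": 1})
```

For instance, `alternate(["a"])` selects `b`, while `alternate(["a", "b", "a"])` selects `c`. A memoryless strategy is the special case where `memory` is a singleton and `update` is the identity on it.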

Given a finite-memory strategy $\sigma \in \Sigma^F$, let $G_\sigma$ be the game graph obtained from $G$ under the
constraint that player 1 follows the strategy $\sigma$. The corresponding definition $G_\pi$ for a player-2
strategy $\pi \in \Pi^F$ is analogous, and we write $G_{\sigma,\pi}$ for the game graph obtained from $G$ if both
players follow the finite-memory strategies $\sigma$ and $\pi$, respectively. Observe that given a 2½-player
game graph $G$ and a finite-memory player-1 strategy $\sigma$, the result $G_\sigma$ is a player-2 MDP. Similarly,
for a player-1 MDP $G$ and a finite-memory player-1 strategy $\sigma$, the result $G_\sigma$ is a Markov chain.
Hence, if $G$ is a 2½-player game graph and the two players follow finite-memory strategies $\sigma$ and $\pi$,
the result $G_{\sigma,\pi}$ is a Markov chain. These observations will be useful in the analysis of 2½-player
games.

Objectives. An objective for a player consists of an ω-regular set of winning plays $\Phi \subseteq \Omega$ [24]. In
this paper we study zero-sum games [13, 22], where the objectives of the two players are
complementary; that is, if the objective of one player is $\Phi$, then the objective of the other player is $\overline{\Phi} = \Omega \setminus \Phi$.
We consider ω-regular objectives specified as Muller objectives. For a play $\omega = \langle s_0, s_1, s_2, \ldots \rangle$, let
$\mathrm{Inf}(\omega)$ be the set $\{ s \in S \mid s = s_k \text{ for infinitely many } k \ge 0 \}$ of states that appear infinitely often
in $\omega$. We use colors to define objectives as in [10]. A 2½-player game $(G, C, \chi, \mathcal{F} \subseteq \mathcal{P}(C))$ consists
of a 2½-player game graph $G$, a finite set $C$ of colors, a partial function $\chi : S \rightharpoonup C$ that assigns
colors to some states, and a winning condition specified by a subset $\mathcal{F}$ of the power set $\mathcal{P}(C)$ of
colors. The winning condition defines a subset $\Phi \subseteq \Omega$ of winning plays, defined as follows:

$$\mathrm{Muller}(\mathcal{F}) = \{ \omega \in \Omega \mid \chi(\mathrm{Inf}(\omega)) \in \mathcal{F} \},$$

that is, the set of paths $\omega$ such that the set of colors appearing infinitely often in $\omega$ is in $\mathcal{F}$.
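For ultimately periodic ("lasso") plays, membership in a Muller objective can be checked directly, because the states visited infinitely often are exactly those on the repeated cycle. The sketch below restricts attention to such plays, which is an assumption made for illustration; the function name and the example data are hypothetical.

```python
def muller_wins(prefix, cycle, chi, F):
    """Check chi(Inf(w)) in F for the play w = prefix . cycle . cycle . ...

    chi is a partial coloring given as a dict (uncolored states are absent),
    and F is a set of frozensets of colors.
    """
    inf_states = set(cycle)   # the prefix is visited only finitely often
    colors = frozenset(chi[s] for s in inf_states if s in chi)
    return colors in F

# Hypothetical example: colors {1, 2}, winning iff both colors recur.
chi = {"a": 1, "b": 2}        # state "c" is uncolored
F = {frozenset({1, 2})}
```

With these data, the play that loops through `a` and `b` forever is winning, while the play that loops through `a` and the uncolored state `c` is not, since only color 1 recurs.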

Remarks. A winning condition $\mathcal{F} \subseteq \mathcal{P}(C)$ has a split if there are sets $C_1, C_2 \in \mathcal{F}$ such that
$C_1 \cup C_2 \notin \mathcal{F}$. A winning condition is a Rabin winning condition if it does not have a split, and it is a
Streett winning condition if $\mathcal{P}(C) \setminus \mathcal{F}$ does not have a split. These notions coincide with the Rabin
and Streett winning conditions usually defined in the literature (see [20, 10] for details). We now
define the reachability, safety, Büchi and coBüchi objectives that will be useful in our proofs.

•  Reachability and safety objectives. Given a set $T \subseteq S$ of “target” states, the reachability
objective requires that some state of $T$ be visited. The set of winning plays is thus $\mathrm{Reach}(T) =
\{ \omega = \langle s_0, s_1, s_2, \ldots \rangle \in \Omega \mid s_k \in T \text{ for some } k \ge 0 \}$. Given a set $F \subseteq S$, the safety objective
requires that only states of $F$ be visited. Thus, the set of winning plays is $\mathrm{Safe}(F) = \{ \omega =
\langle s_0, s_1, s_2, \ldots \rangle \in \Omega \mid s_k \in F \text{ for all } k \ge 0 \}$.

•  Büchi and coBüchi objectives. Given a set $B \subseteq S$ of “Büchi” states, the Büchi objective
requires that $B$ is visited infinitely often. Formally, the set of winning plays is $\mathrm{Büchi}(B) =
\{ \omega \in \Omega \mid \mathrm{Inf}(\omega) \cap B \ne \emptyset \}$. Given $C \subseteq S$, the coBüchi objective requires that all states
visited infinitely often are in $C$. Formally, the set of winning plays is $\mathrm{coBüchi}(C) = \{ \omega \in \Omega \mid
\mathrm{Inf}(\omega) \subseteq C \}$.
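The notion of a split from the remarks and the four objectives above can be checked mechanically on small examples. The following sketch again assumes ultimately periodic plays given as a prefix and a repeated cycle; all function names are hypothetical.

```python
def has_split(F):
    """True if some C1, C2 in F have a union that is not in F."""
    return any((c1 | c2) not in F for c1 in F for c2 in F)

def reach(prefix, cycle, T):
    """Some state of T is visited."""
    return any(s in T for s in list(prefix) + list(cycle))

def safe(prefix, cycle, F):
    """Only states of F are visited."""
    return all(s in F for s in list(prefix) + list(cycle))

def buchi(cycle, B):
    """B is visited infinitely often: Inf(w) meets B."""
    return bool(set(cycle) & set(B))

def cobuchi(cycle, C):
    """All states visited infinitely often are in C: Inf(w) is a subset of C."""
    return set(cycle) <= set(C)

# Hypothetical winning conditions over the colors {1, 2}.
F_split = {frozenset({1}), frozenset({2})}          # {1} | {2} is missing
F_closed = {frozenset({1}), frozenset({1, 2})}      # closed under union
```

On the play with prefix `x` and cycle `y, z`, the state `x` is reached but is not visited infinitely often, so `reach` holds for the target `{x}` while `buchi` fails for `{x}`.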

Sure, almost-sure, positive winning and optimality. Given a player-1 objective $\Phi$, a strategy
$\sigma \in \Sigma$ is sure winning for player 1 from a state $s \in S$ if for every strategy $\pi \in \Pi$ for player 2, we
have $\mathrm{Outcome}(s, \sigma, \pi) \subseteq \Phi$. A strategy $\sigma$ is almost-sure winning for player 1 from the state $s$ for the
objective $\Phi$ if for every player-2 strategy $\pi$, we have $\mathrm{Pr}_s^{\sigma,\pi}(\Phi) = 1$. A strategy $\sigma$ is positive winning
for player 1 from the state $s$ for the objective $\Phi$ if for every player-2 strategy $\pi$, we have $\mathrm{Pr}_s^{\sigma,\pi}(\Phi) > 0$.
The sure, almost-sure and positive winning strategies for player 2 are defined analogously. Given
an objective $\Phi$, the sure winning set $\langle\!\langle 1 \rangle\!\rangle_{sure}(\Phi)$ for player 1 is the set of states from which player 1
has a sure winning strategy. Similarly, the almost-sure winning set $\langle\!\langle 1 \rangle\!\rangle_{almost}(\Phi)$ and the positive
winning set $\langle\!\langle 1 \rangle\!\rangle_{pos}(\Phi)$ for player 1 are the sets of states from which player 1 has an almost-sure
winning and a positive winning strategy, respectively. The sure winning set $\langle\!\langle 2 \rangle\!\rangle_{sure}(\Omega \setminus \Phi)$, the
almost-sure winning set $\langle\!\langle 2 \rangle\!\rangle_{almost}(\Omega \setminus \Phi)$, and the positive winning set $\langle\!\langle 2 \rangle\!\rangle_{pos}(\Omega \setminus \Phi)$ for player 2
are defined analogously. It follows from the definitions that for all 2½-player game graphs and all
objectives $\Phi$, we have $\langle\!\langle 1 \rangle\!\rangle_{sure}(\Phi) \subseteq \langle\!\langle 1 \rangle\!\rangle_{almost}(\Phi) \subseteq \langle\!\langle 1 \rangle\!\rangle_{pos}(\Phi)$. Computing sure, almost-sure and
positive winning sets and strategies is referred to as the qualitative analysis of 2½-player games [11].

Given ω-regular objectives $\Phi \subseteq \Omega$ for player 1 and $\Omega \setminus \Phi$ for player 2, we define the value functions
$\langle\!\langle 1 \rangle\!\rangle_{val}$ and $\langle\!\langle 2 \rangle\!\rangle_{val}$ for the players 1 and 2, respectively, as the following functions from the state
space $S$ to the interval $[0, 1]$ of reals: for all states $s \in S$, let $\langle\!\langle 1 \rangle\!\rangle_{val}(\Phi)(s) = \sup_{\sigma \in \Sigma} \inf_{\pi \in \Pi} \mathrm{Pr}_s^{\sigma,\pi}(\Phi)$
and $\langle\!\langle 2 \rangle\!\rangle_{val}(\Omega \setminus \Phi)(s) = \sup_{\pi \in \Pi} \inf_{\sigma \in \Sigma} \mathrm{Pr}_s^{\sigma,\pi}(\Omega \setminus \Phi)$. In other words, the value $\langle\!\langle 1 \rangle\!\rangle_{val}(\Phi)(s)$ gives the
maximal probability with which player 1 can achieve her objective $\Phi$ from state $s$, and analogously
for player 2. The strategies that achieve the value are called optimal: a strategy $\sigma$ for player 1
is optimal from the state $s$ for the objective $\Phi$ if $\langle\!\langle 1 \rangle\!\rangle_{val}(\Phi)(s) = \inf_{\pi \in \Pi} \mathrm{Pr}_s^{\sigma,\pi}(\Phi)$. The optimal
strategies for player 2 are defined analogously. Computing values and optimal strategies is referred
to as the quantitative analysis of 2½-player games. The set of states with value 1 is called the
limit-sure winning set [11]. For 2½-player game graphs with ω-regular objectives the almost-sure
and limit-sure winning sets coincide [4].
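To make the sup-inf definition tangible, the sketch below approximates player-1 values for the special case of a reachability objective by standard value iteration: player-1 states take the maximum over successors, player-2 states the minimum, and probabilistic states the expectation. This technique is not taken from the paper and does not handle general Muller objectives; the function name, the fixed iteration count, and the example graph are assumptions.

```python
def reach_value_iteration(s1, s2, sp, edges, delta, target, iterations=200):
    """Approximate the player-1 value of Reach(target) from below."""
    states = s1 | s2 | sp
    succ = {s: [t for (u, t) in edges if u == s] for s in states}
    v = {s: 1.0 if s in target else 0.0 for s in states}
    for _ in range(iterations):
        new = {}
        for s in states:
            if s in target:
                new[s] = 1.0
            elif s in s1:
                new[s] = max(v[t] for t in succ[s])   # player 1 maximizes
            elif s in s2:
                new[s] = min(v[t] for t in succ[s])   # player 2 minimizes
            else:
                new[s] = sum(p * v[t] for t, p in delta[s].items())
        v = new
    return v

# Hypothetical example: from r1 the goal is hit with probability 0.5,
# from r2 with probability 0.7; "a" belongs to player 1, "b" to player 2.
s1, s2, sp = {"a", "goal", "sink"}, {"b"}, {"r1", "r2"}
edges = {("a", "r1"), ("a", "r2"), ("b", "r1"), ("b", "r2"),
         ("r1", "goal"), ("r1", "sink"), ("r2", "goal"), ("r2", "sink"),
         ("goal", "goal"), ("sink", "sink")}
delta = {"r1": {"goal": 0.5, "sink": 0.5}, "r2": {"goal": 0.7, "sink": 0.3}}
values = reach_value_iteration(s1, s2, sp, edges, delta, {"goal"})
```

In this example player 1 picks `r2` at `a`, giving value 0.7, while player 2 picks `r1` at `b`, giving value 0.5.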

Let $\mathcal{C} \in \{P, M, F, PM, PF\}$ and consider the family $\Sigma^{\mathcal{C}} \subseteq \Sigma$ of special strategies for player 1.
We say that the family $\Sigma^{\mathcal{C}}$ suffices with respect to a player-1 objective $\Phi$ on a class $\mathcal{G}$ of game
graphs for sure winning if for every game graph $G \in \mathcal{G}$ and state $s \in \langle\!\langle 1 \rangle\!\rangle_{sure}(\Phi)$, there is a
player-1 strategy $\sigma \in \Sigma^{\mathcal{C}}$ such that for every player-2 strategy $\pi \in \Pi$, we have $\mathrm{Outcome}(s, \sigma, \pi) \subseteq \Phi$.
Similarly, the family $\Sigma^{\mathcal{C}}$ suffices with respect to the objective $\Phi$ on the class $\mathcal{G}$ of game graphs for
(a) almost-sure winning if for every game graph $G \in \mathcal{G}$ and state $s \in \langle\!\langle 1 \rangle\!\rangle_{almost}(\Phi)$, there is a
player-1 strategy $\sigma \in \Sigma^{\mathcal{C}}$ such that for every player-2 strategy $\pi \in \Pi$, we have $\mathrm{Pr}_s^{\sigma,\pi}(\Phi) = 1$; (b) positive
winning if for every game graph $G \in \mathcal{G}$ and state $s \in \langle\!\langle 1 \rangle\!\rangle_{pos}(\Phi)$, there is a player-1 strategy
$\sigma \in \Sigma^{\mathcal{C}}$ such that for every player-2 strategy $\pi \in \Pi$, we have $\mathrm{Pr}_s^{\sigma,\pi}(\Phi) > 0$; and (c) optimality
if for every game graph $G \in \mathcal{G}$ and state $s \in S$, there is a player-1 strategy $\sigma \in \Sigma^{\mathcal{C}}$ such that
$\langle\!\langle 1 \rangle\!\rangle_{val}(\Phi)(s) = \inf_{\pi \in \Pi} \mathrm{Pr}_s^{\sigma,\pi}(\Phi)$. The notion of sufficiency for the size of finite-memory strategies is
obtained by referring to the size of the memory $M$ of the strategies. The notions of sufficiency of
strategies for player 2 are defined analogously.

Determinacy. For sure winning, the 1½-player and 2½-player games coincide with 2-player
(deterministic) games where the random player (who chooses the successor at the probabilistic
states) is interpreted as an adversary, i.e., as player 2. Theorem 1 and Theorem 2 state the
classical determinacy results for 2-player and 2½-player game graphs with Muller objectives. It
follows from Theorem 2 that for all Muller objectives $\Phi$ and all $\varepsilon > 0$, there exists an $\varepsilon$-optimal
strategy $\sigma_\varepsilon$ for player 1 such that for all $\pi$ and all $s \in S$ we have $\mathrm{Pr}_s^{\sigma_\varepsilon,\pi}(\Phi) \ge \langle\!\langle 1 \rangle\!\rangle_{val}(\Phi)(s) - \varepsilon$.

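Theorem 1 (Qualitative determinacy [14]) For all 2-player game graphs and Muller objectives
$\Phi$, we have $\langle\!\langle 1 \rangle\!\rangle_{sure}(\Phi) \cap \langle\!\langle 2 \rangle\!\rangle_{sure}(\Omega \setminus \Phi) = \emptyset$ and $\langle\!\langle 1 \rangle\!\rangle_{sure}(\Phi) \cup \langle\!\langle 2 \rangle\!\rangle_{sure}(\Omega \setminus \Phi) = S$. Moreover, on
2-player game graphs, the family of pure finite-memory strategies suffices for sure winning with
respect to Muller objectives.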

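Theorem 2 (Quantitative determinacy [18]) For all 2½-player game graphs, for all Muller
winning conditions $\mathcal{F} \subseteq \mathcal{P}(C)$, and all states $s$, we have $\langle\!\langle 1 \rangle\!\rangle_{val}(\mathrm{Muller}(\mathcal{F}))(s) + \langle\!\langle 2 \rangle\!\rangle_{val}(\Omega \setminus
\mathrm{Muller}(\mathcal{F}))(s) = 1$.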

3 Optimal Memory Bound for Pure Qualitative Winning Strategies

In this section we present optimal memory bounds for pure strategies with respect to qualitative
(almost-sure and positive) winning for 2½-player game graphs with Muller winning conditions. The
result is obtained by a generalization of the result of [10] and depends on the novel constructions
of Zielonka [25] for 2-player games. In [10] the authors use an insightful analysis of Zielonka's
construction to present an upper bound (and also a matching lower bound) on the memory of sure
winning strategies in 2-player games with Muller objectives. In this section we generalize the result
of [10] to show that the same upper bound holds for qualitative winning strategies in 2½-player
games with Muller objectives. We now introduce some notation and the Zielonka tree of a Muller
condition.

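Notation. Let $\mathcal{F} \subseteq \mathcal{P}(C)$ be a winning condition. For $D \subseteq C$ we define $(\mathcal{F} \upharpoonright D) \subseteq \mathcal{P}(D)$ as the
set $\{ D' \in \mathcal{F} \mid D' \subseteq D \}$. For a Muller condition $\mathcal{F} \subseteq \mathcal{P}(C)$ we denote by $\overline{\mathcal{F}}$ the complementary
condition, i.e., $\overline{\mathcal{F}} = \mathcal{P}(C) \setminus \mathcal{F}$. Similarly, for an objective $\Phi$ we denote by $\overline{\Phi}$ the complementary
objective, i.e., $\overline{\Phi} = \Omega \setminus \Phi$.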


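Definition 1 (Zielonka tree of a winning condition [25]) The Zielonka tree of a winning
condition $\mathcal{F} \subseteq \mathcal{P}(C)$, denoted $\mathcal{Z}_{\mathcal{F},C}$, is defined inductively as follows:

1. If $C \notin \mathcal{F}$, then $\mathcal{Z}_{\mathcal{F},C} = \mathcal{Z}_{\overline{\mathcal{F}},C}$, where $\overline{\mathcal{F}} = \mathcal{P}(C) \setminus \mathcal{F}$.

2. If $C \in \mathcal{F}$, then the root of $\mathcal{Z}_{\mathcal{F},C}$ is labeled with $C$. Let $C_0, C_1, \ldots, C_{k-1}$ be all the maximal
sets in $\{ X \notin \mathcal{F} \mid X \subseteq C \}$. Then we attach to the root, as its subtrees, the Zielonka trees of
$\mathcal{F} \upharpoonright C_i$, i.e., $\mathcal{Z}_{\mathcal{F} \upharpoonright C_i, C_i}$, for $i = 0, 1, \ldots, k-1$.

Hence the Zielonka tree is a tree with nodes labeled by sets of colors. A node of $\mathcal{Z}_{\mathcal{F},C}$ is a 0-level
node if it is labeled with a set from $\mathcal{F}$, otherwise it is a 1-level node. In the sequel we write $\mathcal{Z}_{\mathcal{F}}$ to
denote $\mathcal{Z}_{\mathcal{F},C}$ if $C$ is clear from the context. ∎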
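The inductive definition can be turned into a short recursive procedure. The sketch below is a brute-force illustration that enumerates all subsets of the current label, so it is only meant for very small color sets. The names `subsets` and `zielonka_tree`, the tuple encoding of nodes, and the choice to treat the empty set like any other subset are assumptions of this sketch.

```python
from itertools import combinations

def subsets(C):
    """All subsets of C as frozensets, ordered by size."""
    items = sorted(C)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]

def zielonka_tree(F, C):
    """Return the tree as a nested tuple (label, level, children).

    F is a set of frozensets of colors. The level is 0 if the label is in F
    and 1 otherwise. Children are labeled by the maximal subsets of the
    label that lie on the other side of F, which covers both cases of the
    definition (for a label outside F, one works with the complement).
    """
    C = frozenset(C)
    in_F = C in F
    other = [X for X in subsets(C) if (X in F) != in_F]
    maximal = [X for X in other if not any(X < Y for Y in other)]
    children = [zielonka_tree(F, X) for X in maximal]
    return (C, 0 if in_F else 1, children)

# Hypothetical example: colors {1, 2}, winning iff both colors recur.
tree = zielonka_tree({frozenset({1, 2})}, {1, 2})
```

For this condition the root is the 0-level node labeled `{1, 2}`, with two 1-level leaves labeled `{1}` and `{2}`.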


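Definition 2 (The number $m_{\mathcal{F}}$ of a Zielonka tree) Let $\mathcal{F} \subseteq \mathcal{P}(C)$ be a winning condition and
$\mathcal{Z}_{\mathcal{F}_0,C_0}, \mathcal{Z}_{\mathcal{F}_1,C_1}, \ldots, \mathcal{Z}_{\mathcal{F}_{k-1},C_{k-1}}$ be the subtrees attached to the root of the tree $\mathcal{Z}_{\mathcal{F},C}$, where
$\mathcal{F}_i = \mathcal{F} \upharpoonright C_i \subseteq \mathcal{P}(C_i)$ for $i = 0, 1, \ldots, k-1$. We define the number $m_{\mathcal{F}}$ inductively as follows:

$$m_{\mathcal{F}} = \begin{cases}
1 & \text{if } \mathcal{Z}_{\mathcal{F},C} \text{ does not have any subtrees,} \\
\max\{ m_{\mathcal{F}_0}, m_{\mathcal{F}_1}, \ldots, m_{\mathcal{F}_{k-1}} \} & \text{if } C \notin \mathcal{F} \text{ (1-level node),} \\
\sum_{i=0}^{k-1} m_{\mathcal{F}_i} & \text{if } C \in \mathcal{F} \text{ (0-level node).}
\end{cases}$$ ∎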
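The number defined above can be computed by the same recursion that builds the Zielonka tree: a leaf contributes 1, a 1-level node takes the maximum over its children, and a 0-level node takes the sum. The following brute-force sketch (subset enumeration, suitable for tiny color sets only) uses hypothetical names and example conditions.

```python
from itertools import combinations

def subsets(C):
    items = sorted(C)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]

def zielonka_number(F, C):
    """Compute the number m for condition F (a set of frozensets) over colors C."""
    C = frozenset(C)
    in_F = C in F
    # Children of the node labeled C: maximal subsets on the other side of F.
    other = [X for X in subsets(C) if (X in F) != in_F]
    maximal = [X for X in other if not any(X < Y for Y in other)]
    if not maximal:
        return 1                                   # no subtrees
    numbers = [zielonka_number(F, X) for X in maximal]
    return sum(numbers) if in_F else max(numbers)  # 0-level: sum, 1-level: max

# Hypothetical conditions over the colors {1, 2}.
both = {frozenset({1, 2})}                                 # both colors recur
not_both = {frozenset(), frozenset({1}), frozenset({2})}   # complement of `both`
buchi_like = {frozenset({2}), frozenset({1, 2})}           # color 2 recurs
```

For `both` the root is a 0-level node with two leaf children, so the number is 2; for its complement `not_both` the root is a 1-level node and the number is 1; for `buchi_like` the tree is a single path and the number is 1.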


Our goal is to show that for winning conditions $\mathcal{F}$, pure finite-memory qualitative winning
strategies of size $m_{\mathcal{F}}$ exist in 2½-player games. This proves the upper bound. The results of [10]
already established the matching lower bound for 2-player games. This establishes the optimal
bound on the memory of qualitative winning strategies for 2½-player games. We start with the key
notion of attractors that will be crucial in our proofs.


Definition 3 (Attractors) Given a 2½-player game graph $G$ and a set $U \subseteq S$ of states such
that $G \upharpoonright U$ is a subgame, and $T \subseteq S$, we define $\mathrm{Attr}_{1,\bigcirc}(T, U)$ as follows:

$T_0 = T \cap U$; and for $j \ge 0$ we define $T_{j+1}$ from $T_j$ as
$T_{j+1} = T_j \cup \{ s \in (S_1 \cup S_\bigcirc) \cap U \mid E(s) \cap T_j \ne \emptyset \} \cup \{ s \in S_2 \cap U \mid E(s) \cap U \subseteq T_j \}$,

and $A = \mathrm{Attr}_{1,\bigcirc}(T, U) = \bigcup_{j \ge 0} T_j$. We obtain $\mathrm{Attr}_{2,\bigcirc}(T, U)$ by exchanging the roles of player 1 and
player 2. A pure memoryless attractor strategy $\sigma^A : (A \setminus T) \cap S_1 \to S$ for player 1 on $A$ to $T$ is as
follows: for $i > 0$ and a state $s \in (T_i \setminus T_{i-1}) \cap S_1$, the strategy chooses a successor $\sigma^A(s) \in T_{i-1}$
(which exists by definition). ∎
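The attractor is a least fixpoint and can be computed layer by layer, recording an attractor strategy along the way. The sketch below follows the definition for player 1 (with the probabilistic states treated as cooperating with player 1); the function name, the tie-breaking rule for choosing a successor, and the example graph are assumptions made for illustration.

```python
def attractor(T, U, s1, s2, sp, edges):
    """Return the attractor of T within U for player 1 together with the
    random player, and a pure memoryless attractor strategy for player 1."""
    def succ(s):
        return {t for (u, t) in edges if u == s}

    current = set(T) & set(U)          # T_0 = T ∩ U
    strategy = {}
    while True:
        new = set(current)
        for s in set(U) - current:
            if s in (s1 | sp) and succ(s) & current:
                new.add(s)
                if s in s1:
                    # Any successor in the previous layer works; take the
                    # smallest one to make the choice deterministic.
                    strategy[s] = min(succ(s) & current)
            elif s in s2 and succ(s) & set(U) <= current:
                new.add(s)             # player 2 cannot avoid the layer
        if new == current:
            return current, strategy
        current = new

# Hypothetical example: t is the target, z is a trap for player 1.
s1, s2, sp = {"a", "t", "z"}, {"b", "c"}, {"r"}
edges = {("a", "r"), ("a", "z"), ("b", "a"), ("b", "z"), ("c", "a"),
         ("r", "t"), ("r", "z"), ("t", "t"), ("z", "z")}
U = s1 | s2 | sp
A, sigma_A = attractor({"t"}, U, s1, s2, sp, edges)
```

In this example the layers are `{t}`, then `{t, r}`, then `{t, r, a}`, then `{t, r, a, c}`. The player-2 state `b` stays outside the attractor because it can escape to `z`.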

Lemma 1 (Attractor properties) Let $G$ be a 2½-player game graph and $U \subseteq S$ be a set of
states such that $G \upharpoonright U$ is a subgame. For a set $T \subseteq S$ of states, let $Z = \mathrm{Attr}_{1,\bigcirc}(T, U)$. Then the
following assertions hold.

1. $G \upharpoonright (U \setminus Z)$ is a subgame.

2. Let $\sigma^Z$ be a pure memoryless attractor strategy for player 1. For all strategies $\pi$ for player 2
in the subgame $G \upharpoonright U$ and for all states $s \in U$ we have
(a) if $\mathrm{Pr}_s^{\sigma^Z,\pi}(\mathrm{Reach}(Z)) > 0$, then $\mathrm{Pr}_s^{\sigma^Z,\pi}(\mathrm{Reach}(T)) > 0$; and
(b) if $\mathrm{Pr}_s^{\sigma^Z,\pi}(\mathrm{Büchi}(Z)) > 0$, then $\mathrm{Pr}_s^{\sigma^Z,\pi}(\mathrm{Büchi}(T) \mid \mathrm{Büchi}(Z)) = 1$.

Proof. We prove the following cases.

1. Subgame property. For a state $s \in U \setminus Z$, if $s \in S_1 \cup S_\bigcirc$, then $E(s) \cap Z = \emptyset$ (otherwise $s$ would
have been in $Z$), i.e., $E(s) \cap U \subseteq U \setminus Z$. For a state $s \in S_2 \cap (U \setminus Z)$ we have $E(s) \cap (U \setminus Z) \ne \emptyset$
(otherwise $s$ would have been in $Z$). It follows that $G \upharpoonright (U \setminus Z)$ is a subgame.

2. We now prove the two cases.

(a) Positive probability reachability. Let

$$\delta_{\min} = \min\{ \delta(s)(t) \mid s \in S_\bigcirc,\ t \in S,\ \delta(s)(t) > 0 \}.$$

Observe that $\delta_{\min} > 0$. Let $Z = \bigcup_{i \ge 0} T_i$ with $T_0 = T$ (as defined for attractors).
Consider a strategy $\sigma^Z_{1,\bigcirc}$ of both player 1 and the random player on $Z$ as follows:
player 1 follows an attractor strategy $\sigma^Z$ on $Z$ to $T$, and for $s \in (T_i \setminus T_{i-1}) \cap S_\bigcirc$, the
random player chooses a successor $t \in T_{i-1}$. Such a successor exists by definition, and
observe that such a choice is made in the game with probability at least $\delta_{\min}$. The
strategy $\sigma^Z_{1,\bigcirc}$ ensures that for all states $s \in Z$ and for all strategies $\pi$ for player 2 in
$G \upharpoonright U$, the set $T \cap U$ is reached within $|Z|$ steps. Given that player 1 follows an attractor
strategy $\sigma^Z$, the probability of the choices of $\sigma^Z_{1,\bigcirc}$ is at least $(\delta_{\min})^{|Z|}$. It follows that a pure
memoryless attractor strategy $\sigma^Z$ ensures that for all states $s \in Z$ and for all strategies
$\pi$ for player 2 in $G \upharpoonright U$ we have

$$\mathrm{Pr}_s^{\sigma^Z,\pi}(\mathrm{Reach}(T)) \ge (\delta_{\min})^{|Z|} > 0.$$

The desired result follows.

(b) Almost-sure Büchi property. Given a pure memoryless attractor strategy $\sigma^Z$, if the set
$Z$ is visited $\ell$ times, then by the previous part we have that $T$ is reached at least once
with probability $1 - (1 - (\delta_{\min})^{|Z|})^{\ell}$, which goes to 1 as $\ell \to \infty$. Hence for all states
$s$ and strategies $\pi$ in $G \upharpoonright U$, given $\mathrm{Pr}_s^{\sigma^Z,\pi}(\mathrm{Büchi}(Z)) > 0$, we have $\mathrm{Pr}_s^{\sigma^Z,\pi}(\mathrm{Reach}(T) \mid
\mathrm{Büchi}(Z)) = 1$. Since given the event that $Z$ is visited infinitely often (i.e., $\mathrm{Büchi}(Z)$)
the set $T$ is reached with probability 1 from all states, it follows that the set $T$ is visited
infinitely often with probability 1. Formally, for all states $s$ and strategies $\pi$ in $G \upharpoonright U$,
given $\mathrm{Pr}_s^{\sigma^Z,\pi}(\mathrm{Büchi}(Z)) > 0$, we have $\mathrm{Pr}_s^{\sigma^Z,\pi}(\mathrm{Büchi}(T) \mid \mathrm{Büchi}(Z)) = 1$.

The result of the lemma follows. ∎
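The two bounds used in the proof are easy to evaluate numerically. The sketch below computes the lower bound on reaching the target from the attractor and the resulting lower bound on hitting the target within a given number of visits; the function names and the sample values are assumptions chosen for illustration.

```python
from fractions import Fraction

def reach_lower_bound(delta_min, z_size):
    """Lower bound delta_min ** |Z| on reaching T from a state of Z."""
    return delta_min ** z_size

def hit_lower_bound(delta_min, z_size, visits):
    """Lower bound 1 - (1 - delta_min ** |Z|) ** visits on reaching T
    at least once when Z is visited the given number of times."""
    p = reach_lower_bound(delta_min, z_size)
    return 1 - (1 - p) ** visits

# Hypothetical values: smallest positive transition probability 1/2, |Z| = 3.
p = reach_lower_bound(Fraction(1, 2), 3)          # equals 1/8
two_visits = hit_lower_bound(Fraction(1, 2), 3, 2)  # equals 15/64
```

As the number of visits grows the second bound approaches 1, which is the step that turns positive-probability reachability into the almost-sure Büchi property.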

Lemma  1  shows  that  the  complement  of  an  attractor  is  a  subgame;  and  a  pure  memoryless 
attractor  strategy  ensures  that  if  the  attractor  of  a  set  T  is  reached  with  positive  probability,  then 
T  is  reached  with  positive  probability,  and  given  that  the  attractor  of  T  is  visited  infinitely  often, 
then  T  is  visited  infinitely  often  with  probability  1.  We  now  present  the  main  result  of  this  section 
(upper  bound  on  memory  for  qualitative  winning  strategies).  A  matching  lower  bound  follows  from 
the  results  of  [10]  for  2-player  games  (see  Theorem  4). 

Theorem 3 (Qualitative forgetful determinacy) Let $(G, C, \chi, \mathcal{F})$ be a 2½-player game with
Muller winning condition $\mathcal{F}$ for player 1. Let $\Phi = \mathrm{Muller}(\mathcal{F})$, and consider the following sets:

$$W_1^{>0} = \langle\langle 1 \rangle\rangle_{pos}(\Phi); \qquad W_1 = \langle\langle 1 \rangle\rangle_{almost}(\Phi);$$

$$W_2^{>0} = \langle\langle 2 \rangle\rangle_{pos}(\overline{\Phi}); \qquad W_2 = \langle\langle 2 \rangle\rangle_{almost}(\overline{\Phi}).$$

The following assertions hold.

1. We have (a) $W_1^{>0} \cup W_2 = S$ and $W_1^{>0} \cap W_2 = \emptyset$; and (b) $W_2^{>0} \cup W_1 = S$ and $W_2^{>0} \cap W_1 = \emptyset$.

2. (a) Player 1 has a pure strategy $\sigma$ with memory of size $m_{\mathcal{F}}$ such that for all states $s \in W_1^{>0}$
and for all strategies $\pi$ for player 2 we have $\mathrm{Pr}_s^{\sigma,\pi}(\Phi) > 0$; and (b) player 2 has a pure strategy
$\pi$ with memory of size $m_{\overline{\mathcal{F}}}$ such that for all states $s \in W_2$ and for all strategies $\sigma$ for player 1
we have $\mathrm{Pr}_s^{\sigma,\pi}(\overline{\Phi}) = 1$.

3. (a) Player 1 has a pure strategy $\sigma$ with memory of size $m_{\mathcal{F}}$ such that for all states $s \in W_1$ and
for all strategies $\pi$ for player 2 we have $\mathrm{Pr}_s^{\sigma,\pi}(\Phi) = 1$; and (b) player 2 has a pure strategy $\pi$
with memory of size $m_{\overline{\mathcal{F}}}$ such that for all states $s \in W_2^{>0}$ and for all strategies $\sigma$ for player 1
we have $\mathrm{Pr}_s^{\sigma,\pi}(\overline{\Phi}) > 0$.

Proof. The first part of the result is a consequence of Theorem 2. We concentrate on the
proof of part 2. The last part (part 3) follows from a symmetric argument.

The proof goes by induction on the structure of the Zielonka tree $Z_{\mathcal{F},C}$ of the winning condition
$\mathcal{F}$. We assume that $C \notin \mathcal{F}$. The case when $C \in \mathcal{F}$ can be proved by a similar argument: if $C \in \mathcal{F}$,
then we consider a color $c \notin C$ and the winning condition $\mathcal{F}' = \mathcal{F} \subseteq \mathcal{P}(C \cup \{c\})$ with $C \cup \{c\} \notin \mathcal{F}'$.
Hence we consider, without loss of generality, that $C \notin \mathcal{F}$, and let $C_0, C_1, \ldots, C_{k-1}$ be the labels of
the subtrees attached to the root $C$, i.e., $C_0, C_1, \ldots, C_{k-1}$ are the maximal subsets of colors that appear
in $\mathcal{F}$. We define by induction a non-decreasing sequence of sets $(U_j)_{j \ge 0}$ as follows. Let $U_0 = \emptyset$,
and for $j > 0$ we define $U_j$ below:

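1. $A_j = \mathrm{Attr}_{1,\bigcirc}(U_{j-1}, S)$ and $X_j = S \setminus A_j$;

2. $D_j = C \setminus C_{j \bmod k}$ and $Y_j = X_j \setminus \mathrm{Attr}_{2,\bigcirc}(\chi^{-1}(D_j), X_j)$;

3. let $Z_j$ be the set of positive winning states for player 1 in $(G \upharpoonright Y_j, C_{j \bmod k}, \chi, \mathcal{F} \upharpoonright C_{j \bmod k})$
(i.e., $Z_j = \langle\langle 1 \rangle\rangle_{pos}(\mathrm{Muller}(\mathcal{F} \upharpoonright C_{j \bmod k}))$ in $G \upharpoonright Y_j$); hence $(Y_j \setminus Z_j)$ is almost-sure winning
for player 2 in the subgame; and

4. $U_j = A_j \cup Z_j$.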

Fig 1 describes all these sets. The properties of attractors and almost-sure winning states ensure
that certain edges are forbidden between the sets. This is shown in Fig 2. We start with a few
observations about the construction.

1. Observation 1. For all $s \in S_2 \cap Z_j$, we have $E(s) \subseteq Z_j \cup A_j$. This follows from the following
case analysis.

• Since $Y_j$ is the complement of an attractor set $\mathrm{Attr}_{2,\bigcirc}(\chi^{-1}(D_j), X_j)$, it follows that for all
states $s \in S_2 \cap Y_j$ we have $E(s) \cap X_j \subseteq Y_j$. It follows that $E(s) \subseteq Y_j \cup A_j$.

• Since player 2 can win almost-surely from the set $Y_j \setminus Z_j$, if a state $s \in Y_j \cap S_2$ has an
edge to $Y_j \setminus Z_j$, then $s \in Y_j \setminus Z_j$. Hence for $s \in S_2 \cap Z_j$ we have $E(s) \cap (Y_j \setminus Z_j) = \emptyset$.

2. Observation 2. For all $s \in X_j \cap (S_1 \cup S_{\bigcirc})$ we have (a) $E(s) \cap A_j = \emptyset$ (else $s$ would have been
in $A_j$); and (b) if $s \in Y_j \setminus Z_j$, then $E(s) \cap Z_j = \emptyset$ (else $s$ would have been in $Z_j$).

3. Observation 3. For all $s \in Y_j \cap S_{\bigcirc}$ we have $E(s) \subseteq Y_j$.


Figure  1:  The  sets  of  the  construction. 


We denote by $\mathcal{F}_i$ the winning condition $\mathcal{F} \upharpoonright C_i$, for $i = 0, 1, \ldots, k-1$, and $\overline{\mathcal{F}_i} = \mathcal{P}(C_i) \setminus \mathcal{F}_i$.

By the induction hypothesis on $\mathcal{F}_i = \mathcal{F} \upharpoonright C_{j \bmod k}$, player 1 has a pure positive winning strategy of
size $m_{\mathcal{F}_i}$ from $Z_j$ and player 2 has a pure almost-sure winning strategy of size $m_{\overline{\mathcal{F}_i}}$ from $Y_j \setminus Z_j$.
Let $W = \bigcup_{j \ge 0} U_j$. We will show in Lemma 2 that player 1 has a pure positive winning strategy of
size $m_{\mathcal{F}}$ from $W$; and then in Lemma 3 we will show that player 2 has a pure almost-sure winning
strategy of size $m_{\overline{\mathcal{F}}}$ from $S \setminus W$. This completes the proof. We now prove Lemmas 2 and 3. ∎

Lemma 2 Player 1 has a pure positive winning strategy of size $m_{\mathcal{F}}$ from the set $W$.

Proof. By the induction hypothesis on $j$, player 1 has a pure positive winning strategy $\sigma^u_{j-1}$ of size $m_{\mathcal{F}}$
from $U_{j-1}$. From the set $A_j = \mathrm{Attr}_{1,\bigcirc}(U_{j-1}, S)$, player 1 has a pure memoryless attractor strategy
$\sigma^a_j$ to bring the game to $U_{j-1}$ with positive probability (Lemma 1, part 2(a)), and can then use $\sigma^u_{j-1}$
to ensure winning with positive probability from the set $A_j$. Let $\sigma^z_j$ be the pure positive winning
strategy for player 1 in $Z_j$ of size $m_{\mathcal{F}_i}$, where $i = j \bmod k$. We now show that the combination of
strategies $\sigma^u_{j-1}$, $\sigma^a_j$ and $\sigma^z_j$ ensures positive probability winning for player 1 from $U_j$. If the play
starts at a state $s \in Z_j$, then player 1 follows $\sigma^z_j$. If the play stays in $Y_j$ forever, then the strategy
$\sigma^z_j$ ensures that player 1 wins with positive probability. By Observation 1 of Theorem 3, for all
states $s \in Y_j \cap S_2$, we have $E(s) \subseteq Y_j \cup A_j$. Hence if the play leaves $Y_j$, then player 2 must choose
an edge to $A_j$. In $A_j$ player 1 can use the attractor strategy $\sigma^a_j$ followed by $\sigma^u_{j-1}$ to ensure a positive
probability win. Hence if the play is in $Y_j$ forever with probability 1, then $\sigma^z_j$ ensures a positive
probability win, and if the play reaches $A_j$ with positive probability, then $\sigma^a_j$ followed by $\sigma^u_{j-1}$
ensures a positive probability win.

We now formally present $\sigma^u_j$ defined on $U_j$. Let $\sigma^z_j = (\sigma^z_{j,u}, \sigma^z_{j,m})$ be the strategy obtained from the
inductive hypothesis, defined on $Z_j$ (i.e., arbitrary elsewhere), of size $m_{\mathcal{F}_i}$, where $i = j \bmod k$, and
ensuring winning with positive probability on $Z_j$. Let $\sigma^z_{j,u}$ be the memory-update function and $\sigma^z_{j,m}$
be the next-move function of $\sigma^z_j$. We assume the memory $M_{\mathcal{F}_i}$ of $\sigma^z_j$ to be the set $\{1, 2, \ldots, m_{\mathcal{F}_i}\}$.
The strategy $\sigma^a_j : (A_j \setminus U_{j-1}) \cap S_1 \to A_j$ is a pure memoryless attractor strategy on $A_j$ to $U_{j-1}$.

Figure 2: The sets of the construction with forbidden edges.


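The strategy $\sigma^u_j$ is as follows: the memory-update function is

$$\sigma^u_{j,u}(s, m) = \begin{cases}
\sigma^u_{j-1,u}(s, m) & s \in U_{j-1}; \\
\sigma^z_{j,u}(s, m) & s \in Z_j,\ m \in M_{\mathcal{F}_i}; \\
1 & \text{otherwise};
\end{cases}$$

the next-move function is

$$\sigma^u_{j,m}(s, m) = \begin{cases}
\sigma^u_{j-1,m}(s, m) & s \in U_{j-1} \cap S_1; \\
\sigma^z_{j,m}(s, m) & s \in Z_j \cap S_1,\ m \in M_{\mathcal{F}_i}; \\
\sigma^z_{j,m}(s, 1) & s \in Z_j \cap S_1,\ m \notin M_{\mathcal{F}_i}; \\
\sigma^a_j(s) & s \in (A_j \setminus U_{j-1}) \cap S_1.
\end{cases}$$

The strategy $\sigma^u_j$ formally defines the strategy we described and proves the result. ∎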

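Lemma 3 Player 2 has a pure almost-sure winning strategy of size $m_{\overline{\mathcal{F}}}$ from the set $S \setminus W$.

Proof. Let $\ell \in \mathbb{N}$ be such that $\ell \bmod k = 0$ and $W = U_{\ell-1} = U_{\ell} = U_{\ell+1} = \cdots = U_{\ell+k-1}$. From
the equality $W = U_{\ell-1} = U_{\ell}$ we have $\mathrm{Attr}_{1,\bigcirc}(W, S) = W$. Let us denote $\overline{W} = S \setminus W$. Hence
$G \upharpoonright \overline{W}$ is a subgame (by Lemma 1), and also for all $s \in \overline{W} \cap (S_1 \cup S_{\bigcirc})$ we have $E(s) \subseteq \overline{W}$. The
equality $U_{\ell+i-1} = U_{\ell+i}$ implies that $Z_{\ell+i} = \emptyset$. Hence for all $i = 0, 1, \ldots, k-1$, we have $Z_{\ell+i} = \emptyset$.
By the inductive hypothesis, for all $i = 0, 1, \ldots, k-1$, player 2 has a pure almost-sure winning strategy
$\pi^i$ of size $m_{\overline{\mathcal{F}_i}}$ in the game $(G \upharpoonright Y_{\ell+i}, C_i, \chi, \mathcal{F} \upharpoonright C_i)$.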

We now describe the construction of a pure almost-sure winning strategy $\pi^*$ for player 2 in $\overline{W}$.
For $D_i = C \setminus C_i$ we denote by $\widehat{D}_i = \chi^{-1}(D_i)$ the set of states with colors in $D_i$. If the play starts in a
state in $Y_{\ell+i}$, for $i = 0, 1, \ldots, k-1$, then player 2 uses the almost-sure winning strategy $\pi^i$. If the
play leaves $Y_{\ell+i}$, then the play must reach $\overline{W} \setminus Y_{\ell+i} = \mathrm{Attr}_{2,\bigcirc}(\widehat{D}_i, \overline{W})$, since player 1 and random
states do not have edges to $W$. In $\mathrm{Attr}_{2,\bigcirc}(\widehat{D}_i, \overline{W})$, player 2 plays a pure memoryless attractor
strategy to reach the set $\widehat{D}_i$ with positive probability. If the set $\widehat{D}_i$ is reached, then a state in
$Y_{\ell+((i+1) \bmod k)}$ or in $\mathrm{Attr}_{2,\bigcirc}(\widehat{D}_{(i+1) \bmod k}, \overline{W})$ is reached. If $Y_{\ell+((i+1) \bmod k)}$ is reached, then $\pi^{(i+1) \bmod k}$
is followed, and otherwise the pure memoryless attractor strategy to reach the set $\widehat{D}_{(i+1) \bmod k}$
with positive probability is followed. Of course, the play may leave $Y_{\ell+((i+1) \bmod k)}$ and reach
$Y_{\ell+((i+2) \bmod k)}$, and then we would repeat the reasoning, and so on. Let us analyze the various cases to
prove that $\pi^*$ is almost-sure winning for player 2.

1. If the play finally settles in some $Y_{\ell+i}$, for $i = 0, 1, \ldots, k-1$, then from this moment player 2
follows $\pi^i$ and ensures that the objective $\overline{\Phi}$ is satisfied with probability 1. Formally, for all
states $s \in \overline{W}$, for all strategies $\sigma$ for player 1 we have $\mathrm{Pr}_s^{\sigma,\pi^*}(\overline{\Phi} \mid \text{coBüchi}(Y_{\ell+i})) = 1$. This
holds for all $i = 0, 1, \ldots, k-1$, and hence for all states $s \in \overline{W}$, for all strategies $\sigma$ for player 1
we have $\mathrm{Pr}_s^{\sigma,\pi^*}(\overline{\Phi} \mid \bigcup_{0 \le i \le k-1} \text{coBüchi}(Y_{\ell+i})) = 1$.

2. Otherwise, for all $i = 0, 1, \ldots, k-1$, the set $\overline{W} \setminus Y_{\ell+i} = \mathrm{Attr}_{2,\bigcirc}(\widehat{D}_i, \overline{W})$ is visited infinitely
often. By Lemma 1, given that $\mathrm{Attr}_{2,\bigcirc}(\widehat{D}_i, \overline{W})$ is visited infinitely often, the attractor strategy
ensures that the set $\widehat{D}_i$ is visited infinitely often with probability 1. Formally, for all states
$s \in \overline{W}$, for all strategies $\sigma$ for player 1, for all $i = 0, 1, \ldots, k-1$, we have
$\mathrm{Pr}_s^{\sigma,\pi^*}(\text{Büchi}(\widehat{D}_i) \mid \text{Büchi}(\overline{W} \setminus Y_{\ell+i})) = 1$; and also
$\mathrm{Pr}_s^{\sigma,\pi^*}(\text{Büchi}(\widehat{D}_i) \mid \bigcap_{0 \le i \le k-1} \text{Büchi}(\overline{W} \setminus Y_{\ell+i})) = 1$. It follows
that for all states $s \in \overline{W}$, for all strategies $\sigma$ for player 1 we have
$\mathrm{Pr}_s^{\sigma,\pi^*}(\bigcap_{0 \le i \le k-1} \text{Büchi}(\widehat{D}_i) \mid \bigcap_{0 \le i \le k-1} \text{Büchi}(\overline{W} \setminus Y_{\ell+i})) = 1$.
Hence, for every $i$, the play visits states with colors not in $C_i$ infinitely often with
probability 1. Hence the set of colors visited infinitely often is not contained in any $C_i$. Since
$C_0, C_1, \ldots, C_{k-1}$ are all the maximal subsets in $\mathcal{F}$, the set of colors visited infinitely
often is not in $\mathcal{F}$ with probability 1, and hence player 2 wins almost-surely.

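Hence it follows that for all strategies $\sigma$ and for all states $s \in (S \setminus W)$ we have $\mathrm{Pr}_s^{\sigma,\pi^*}(\overline{\Phi}) = 1$.
To complete the proof we present a precise description of the strategy $\pi^*$ with memory of size $m_{\overline{\mathcal{F}}}$.
Let $\pi^i = (\pi^i_u, \pi^i_m)$ be an almost-sure winning strategy for player 2 for the subgame on $Y_{\ell+i}$ with
memory $M_{\overline{\mathcal{F}_i}}$. By definition we have $m_{\overline{\mathcal{F}}} = \sum_{i=0}^{k-1} m_{\overline{\mathcal{F}_i}}$. Let $M_{\overline{\mathcal{F}}} = \bigcup_{i=0}^{k-1} (M_{\overline{\mathcal{F}_i}} \times \{i\})$. This set is
not exactly the set $\{1, 2, \ldots, m_{\overline{\mathcal{F}}}\}$, but has the same cardinality (which suffices for our purpose).
We define the strategy $\pi^*$ as follows:

$$\pi^*_u(s, (m, i)) = \begin{cases}
(\pi^i_u(s, m), i) & s \in Y_{\ell+i}; \\
(1, (i+1) \bmod k) & \text{otherwise};
\end{cases}$$

$$\pi^*_m(s, (m, i)) = \begin{cases}
\pi^i_m(s, m) & s \in Y_{\ell+i}; \\
\pi^{L_i}(s) & s \in L_i \setminus \widehat{D}_i; \\
s_i & s \in \widehat{D}_i,\ s_i \in E(s) \cap \overline{W};
\end{cases}$$

where $L_i = \mathrm{Attr}_{2,\bigcirc}(\widehat{D}_i, \overline{W})$; $\pi^{L_i}$ is a pure memoryless attractor strategy on $L_i$ to $\widehat{D}_i$, and $s_i$ is a
successor state of $s$ in $\overline{W}$ (such a state exists since $\overline{W}$ induces a subgame). This formally represents
$\pi^*$, and the size of $\pi^*$ satisfies the required bound. Observe that the disjoint sum of all $M_{\overline{\mathcal{F}_i}}$ was
required since $Y_{\ell}, Y_{\ell+1}, \ldots, Y_{\ell+k-1}$ may not be disjoint and the strategy $\pi^*$ needs to know which $Y_{\ell+i}$
the play is in. ∎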
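
The role of the disjoint-sum memory can be illustrated with a short Python sketch. It is only a sketch under stated assumptions: the functions `disjoint_memory` and `combined_update`, the callback `sub_update(i, state, m)` standing in for the memory-update functions of the sub-strategies, and the example sets are hypothetical names introduced here for illustration.

```python
from typing import Callable, Hashable, List, Sequence, Set, Tuple

Memory = Tuple[int, int]  # (memory value of sub-strategy i, index i)


def disjoint_memory(sizes: Sequence[int]) -> List[Memory]:
    """Disjoint union of the memories {1..sizes[i]} x {i}; its cardinality
    is the sum of the sizes."""
    return [(m, i) for i, size in enumerate(sizes) for m in range(1, size + 1)]


def combined_update(
    state: Hashable,
    memory: Memory,
    y_sets: Sequence[Set[Hashable]],
    sub_update: Callable[[int, Hashable, int], int],
) -> Memory:
    """Memory update of the combined strategy: inside the i-th set, delegate
    to sub-strategy i; otherwise reset to 1 and move to index (i+1) mod k."""
    m, i = memory
    k = len(y_sets)
    if state in y_sets[i]:
        return (sub_update(i, state, m), i)
    return (1, (i + 1) % k)


# Hypothetical data: three sub-strategies with memory sizes 2, 1 and 3,
# and overlapping sets ("a" belongs to the first and the third set).
SIZES = [2, 1, 3]
MEM = disjoint_memory(SIZES)
Y_SETS = [{"a"}, {"b"}, {"a", "c"}]


def keep(i: int, state: Hashable, m: int) -> int:
    """Placeholder sub-strategy update that leaves the memory unchanged."""
    return m
```

Because the sets may overlap, the state alone does not determine which sub-strategy is active; the index component of the memory records it, which is why the combined memory is a disjoint union rather than a maximum.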

Lower bound. In [10] the authors show a matching lower bound for sure winning strategies in
2-player games. Note that in 2-player games any pure almost-sure winning or any pure
positive winning strategy is also a sure winning strategy. This observation along with the result
of [10] gives us the following result.

Algorithm 1 MullerQualitativeWithoutC

Input: A 2½-player game graph $G$, a Muller objective $\mathrm{Muller}(\mathcal{F})$ for player 1,
with $\mathcal{F} \subseteq \mathcal{P}(C)$ and $C \notin \mathcal{F}$.

Output: $W_1$ and $W_2$.

1. Let $C_0, C_1, \ldots, C_{k-1}$ be the maximal sets that appear in $\mathcal{F}$.

2. $U_0 = \emptyset$; $j = 0$; $G^0 = G$;

3. do {

3.1 $D_j = C \setminus C_{j \bmod k}$;

3.2 $Y_j = S^j \setminus \mathrm{Attr}_{2,\bigcirc}(\chi^{-1}(D_j), S^j)$;

3.3 $(A_1^j, A_2^j)$ = MullerQualitativeWithC$(G^j \upharpoonright Y_j, \mathcal{F} \upharpoonright C_{j \bmod k})$;

3.4 if ($A_1^j \neq \emptyset$)

3.4.1 $U_{j+1} = U_j \cup \mathrm{Attr}_{1,\bigcirc}(U_j \cup A_1^j, S^j)$;

3.4.2 $G^{j+1} = G \upharpoonright (S \setminus U_{j+1})$;

3.5 $j = j + 1$;

} while ($j < k \ \lor\ \lnot(j \bmod k = 0 \ \land\ j > k \ \land\ \forall i.\ j - k \le i < j.\ A_1^i = \emptyset)$);

4. return $(W_1, W_2) = (U_j, S \setminus U_j)$.


Theorem 4 (Lower bound [10]) For all Muller winning conditions $\mathcal{F} \subseteq \mathcal{P}(C)$, there is a 2-player
game $(G, C, \chi, \mathcal{F})$ (with a 2-player game graph $G$) such that every pure almost-sure and
positive winning strategy for player 1 requires memory of size at least $m_{\mathcal{F}}$; and every pure
almost-sure and positive winning strategy for player 2 requires memory of size at least $m_{\overline{\mathcal{F}}}$.

3.1  Complexity  for  qualitative  analysis 

We now present algorithms to compute the almost-sure and positive winning states for Muller
objectives $\mathrm{Muller}(\mathcal{F})$ in 2½-player games. We consider two cases: the case when $C \in \mathcal{F}$ and
the case when $C \notin \mathcal{F}$. We present the algorithm for the latter case (which recursively calls the former case).
Once the algorithm for the latter case is obtained, we show how the algorithm can be iteratively
used to solve the former case.

Informal description of the algorithm. We present an algorithm to compute the positive winning
sets for player 1 and the almost-sure winning sets for player 2 for Muller objectives $\mathrm{Muller}(\mathcal{F})$
for player 1 in 2½-player game graphs. We consider the case with $C \notin \mathcal{F}$ and refer to this
algorithm as MullerQualitativeWithoutC; for the case $C \in \mathcal{F}$ we refer to the algorithm as
MullerQualitativeWithC. The algorithm proceeds by iteratively removing positive winning sets for
player 1: at iteration $j$ the game graph is denoted by $G^j$ and the set of states by $S^j$. The algorithm
is described as Algorithm 1.

Correctness. If $W_1$ and $W_2$ are the outputs of Algorithm 1, then $W_1 = \langle\langle 1 \rangle\rangle_{pos}(\mathrm{Muller}(\mathcal{F}))$ and
$W_2 = \langle\langle 2 \rangle\rangle_{almost}(\mathrm{Muller}(\overline{\mathcal{F}}))$, where $\overline{\mathcal{F}} = \mathcal{P}(C) \setminus \mathcal{F}$. The correctness follows from the correctness arguments of Theorem 3.
We now present an algorithm to compute the almost-sure winning states $\langle\langle 1 \rangle\rangle_{almost}(\mathrm{Muller}(\mathcal{F}))$ for
player 1 and the positive winning states $\langle\langle 2 \rangle\rangle_{pos}(\mathrm{Muller}(\overline{\mathcal{F}}))$ for player 2 for Muller objectives $\mathrm{Muller}(\mathcal{F})$
with $C \notin \mathcal{F}$. Once we present this algorithm, it is easy to exchange the roles of the players to
obtain the algorithm MullerQualitativeWithC. The algorithm to compute almost-sure winning
states for player 1 for Muller objectives $\mathrm{Muller}(\mathcal{F})$ with $C \notin \mathcal{F}$ proceeds as follows: the algorithm
iteratively uses MullerQualitativeWithoutC and runs for at most $|S|$ iterations. At iteration $j$

Algorithm 2 MullerQualitativeWithoutCIterative

Input: A 2½-player game graph $G$, a Muller objective $\mathrm{Muller}(\mathcal{F})$ for player 1,
with $\mathcal{F} \subseteq \mathcal{P}(C)$ and $C \notin \mathcal{F}$.

Output: $W_1$ and $W_2$.

1. Let $C_0, C_1, \ldots, C_{k-1}$ be the maximal sets that appear in $\mathcal{F}$.

2. $X_0 = \emptyset$; $j = 0$; $G^0 = G$;

3. do {

3.1 $(A_1^j, A_2^j)$ = MullerQualitativeWithoutC$(G^j, \mathcal{F})$;

3.2 if ($A_2^j \neq \emptyset$)

3.2.1 $X_{j+1} = X_j \cup \mathrm{Attr}_{2,\bigcirc}(X_j \cup A_2^j, S^j)$;

3.2.2 $G^{j+1} = G \upharpoonright (S \setminus X_{j+1})$;

3.3 $j = j + 1$;

} while ($A_2^{j-1} \neq \emptyset$);

4. return $(W_1, W_2) = (S \setminus X_j, X_j)$.


the algorithm computes the almost-sure winning set $A_2^j$ for player 2 in the present subgame $G^j$,
and the set of states from which player 2 can reach $A_2^j$ with positive probability. This set is
removed from the game graph, and the algorithm iterates on a smaller game graph. The algorithm
is formally described as Algorithm 2.

Correctness. Let $W_1$ and $W_2$ be the output of Algorithm 2; then $W_1 = \langle\langle 1 \rangle\rangle_{almost}(\mathrm{Muller}(\mathcal{F}))$
and $W_2 = \langle\langle 2 \rangle\rangle_{pos}(\mathrm{Muller}(\overline{\mathcal{F}}))$, where $\overline{\mathcal{F}} = \mathcal{P}(C) \setminus \mathcal{F}$. It is clear that $W_2 \subseteq \langle\langle 2 \rangle\rangle_{pos}(\mathrm{Muller}(\overline{\mathcal{F}}))$. We now argue that
$W_1 = \langle\langle 1 \rangle\rangle_{almost}(\mathrm{Muller}(\mathcal{F}))$ to complete the correctness argument. When the algorithm terminates, let
the game graph be $G^j$, and we have $A_2^j = \emptyset$. Then in $G^j$, player 1 wins with positive probability
from all states. Since Muller objectives are tail objectives (independent of finite prefixes of plays),
it follows from the results of [2] that if a player wins in a game with positive probability from all
states for a Muller objective, then the player wins with value 1 from all states. It follows that
$W_1 = \langle\langle 1 \rangle\rangle_{almost}(\mathrm{Muller}(\mathcal{F}))$. The correctness follows.

Time and space complexity. We now argue that the space requirement of the algorithms is
polynomial. Let us denote the space recurrence of Algorithm 1 as $S(n, c)$ for game graphs with
$n$ states and Muller objectives $\mathrm{Muller}(\mathcal{F})$ with $c$ colors (i.e., $\mathcal{F} \subseteq \mathcal{P}(C)$ with $|C| = c$). Then the
recurrence satisfies $S(n, c) = O(n) + S(n, c-1) = O(n \cdot c)$. The recurrence requires space
for recursive calls with at least one less color (denoted by $S(n, c-1)$), and $O(n)$ space for the
computation of the loop of the algorithm. This gives a PSPACE upper bound, and a matching
lower bound (of PSPACE-hardness) for the special case of 2-player game graphs is given in [15].
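
The space recurrence can be unrolled mechanically, as in the following Python sketch. It rests on two simplifying assumptions that are not from the source: the $O(n)$ term is counted as exactly $n$ units, and the base case $S(n, 0)$ is taken to be 0. The function name `space_bound` is hypothetical.

```python
def space_bound(n: int, c: int) -> int:
    """Unroll S(n, c) = n + S(n, c - 1) with the assumed base case S(n, 0) = 0.

    Each recursion level stores O(n) data for the loop of the algorithm and
    then recurses on an objective with at least one color fewer."""
    if c == 0:
        return 0
    return n + space_bound(n, c - 1)
```

Under these assumptions the recursion depth is at most $c$ and each level holds $n$ units, so the total is $n \cdot c$, which matches the $O(n \cdot c)$ bound stated above.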

Theorem 5 (Algorithm and complexity) The following assertions hold.

1. Given a game $(G, C, \chi, \mathcal{F})$, Algorithm 1 and Algorithm 2 compute an almost-sure winning
strategy and the almost-sure winning sets in $O(((|S| + |E|) \cdot d)^{h+1})$ time and $O(|S| \cdot |C|)$ space,
where $d$ is the maximum degree of a node and $h$ is the height of the Zielonka tree $Z_{\mathcal{F},C}$.

2. Given a game $(G, C, \chi, \mathcal{F})$ and a state $s$, it is PSPACE-complete to decide whether
$s \in \langle\langle 1 \rangle\rangle_{almost}(\mathrm{Muller}(\mathcal{F}))$.

4 Optimal Memory Bound for Pure Optimal Strategies

In this section we extend the sufficiency results for families of strategies from almost-sure winning
to optimality with respect to all Muller objectives. In the following, we fix a 2½-player game
graph $G$. We first present a useful proposition and then some definitions. Since Muller objectives
are infinitary objectives (independent of finite prefixes), the following proposition is immediate.

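Proposition 1 (Optimality conditions) For all Muller objectives $\Phi$, for every $s \in S$ the
following conditions hold.

1. If $s \in S_1$, then for all $t \in E(s)$ we have $\langle\langle 1 \rangle\rangle_{val}(\Phi)(s) \ge \langle\langle 1 \rangle\rangle_{val}(\Phi)(t)$, and for some $t \in E(s)$
we have $\langle\langle 1 \rangle\rangle_{val}(\Phi)(s) = \langle\langle 1 \rangle\rangle_{val}(\Phi)(t)$.

2. If $s \in S_2$, then for all $t \in E(s)$ we have $\langle\langle 1 \rangle\rangle_{val}(\Phi)(s) \le \langle\langle 1 \rangle\rangle_{val}(\Phi)(t)$, and for some $t \in E(s)$
we have $\langle\langle 1 \rangle\rangle_{val}(\Phi)(s) = \langle\langle 1 \rangle\rangle_{val}(\Phi)(t)$.

3. If $s \in S_{\bigcirc}$, then $\langle\langle 1 \rangle\rangle_{val}(\Phi)(s) = \sum_{t \in E(s)} \langle\langle 1 \rangle\rangle_{val}(\Phi)(t) \cdot \delta(s)(t)$.

Similar conditions hold for the value function $\langle\langle 2 \rangle\rangle_{val}(\Omega \setminus \Phi)$ of player 2.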
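
The three local conditions can be checked mechanically for a candidate value vector, as the following Python sketch illustrates. The checker `satisfies_optimality_conditions` and the six-state example (with hand-picked values) are hypothetical and serve only to make the conditions concrete; the sketch does not compute values, it only tests the local conditions.

```python
from typing import Dict, Hashable, Set

State = Hashable


def satisfies_optimality_conditions(
    player1: Set[State],
    player2: Set[State],
    prob: Set[State],
    edges: Dict[State, Set[State]],
    delta: Dict[State, Dict[State, float]],
    val: Dict[State, float],
    tol: float = 1e-9,
) -> bool:
    """Check the local conditions: the value at a player-1 state is the
    maximum over successors, at a player-2 state the minimum, and at a
    probabilistic state the delta-weighted average."""
    for s in player1:
        if abs(val[s] - max(val[t] for t in edges[s])) > tol:
            return False
    for s in player2:
        if abs(val[s] - min(val[t] for t in edges[s])) > tol:
            return False
    for s in prob:
        expected = sum(delta[s][t] * val[t] for t in edges[s])
        if abs(val[s] - expected) > tol:
            return False
    return True


# Hypothetical example: w and l are absorbing states with values 1 and 0.
S1 = {"a", "w", "l"}
S2 = {"b"}
SP = {"p", "q"}
EDGES = {"a": {"p", "q"}, "b": {"p", "q"}, "p": {"w", "l"},
         "q": {"w", "l"}, "w": {"w"}, "l": {"l"}}
DELTA = {"p": {"w": 0.5, "l": 0.5}, "q": {"w": 0.25, "l": 0.75}}
VAL = {"w": 1.0, "l": 0.0, "p": 0.5, "q": 0.25, "a": 0.5, "b": 0.25}
BAD_VAL = dict(VAL, a=0.25)  # player 1 at "a" would ignore the better move
```

The vector `VAL` passes the check, while `BAD_VAL` fails at the player-1 state `a`, whose value would be below that of its best successor.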

Definition 4 (Value classes) Given a Muller objective $\Phi$, for every real $r \in [0, 1]$ the value class
with value $r$, $\mathrm{VC}(\Phi, r) = \{ s \in S \mid \langle\langle 1 \rangle\rangle_{val}(\Phi)(s) = r \}$, is the set of states with value $r$ for player 1.
For $r \in [0, 1]$ we denote by $\mathrm{VC}(\Phi, >r) = \bigcup_{q > r} \mathrm{VC}(\Phi, q)$ the value classes greater than $r$ and by
$\mathrm{VC}(\Phi, <r) = \bigcup_{q < r} \mathrm{VC}(\Phi, q)$ the value classes smaller than $r$. ∎

Definition 5 (Boundary probabilistic states) Given a set $U$ of states, a state $s \in U \cap S_{\bigcirc}$ is
a boundary probabilistic state for $U$ if $E(s) \cap (S \setminus U) \neq \emptyset$, i.e., the probabilistic state has an edge
out of the set $U$. We denote by $\mathrm{Bnd}(U)$ the set of boundary probabilistic states for $U$. For a value
class $\mathrm{VC}(\Phi, r)$ we denote by $\mathrm{Bnd}(\Phi, r)$ the set of boundary probabilistic states of value class $r$. ∎

Observation. It follows from Proposition 1 that for a state $s \in \mathrm{Bnd}(\Phi, r)$ we have $E(s) \cap \mathrm{VC}(\Phi, >r) \neq \emptyset$
and $E(s) \cap \mathrm{VC}(\Phi, <r) \neq \emptyset$, i.e., the boundary probabilistic states have edges to higher
and lower value classes. It follows that for all Muller objectives $\Phi$ we have $\mathrm{Bnd}(\Phi, 1) = \emptyset$ and
$\mathrm{Bnd}(\Phi, 0) = \emptyset$.
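
Value classes and boundary probabilistic states are simple set computations once a value vector is given. The following Python sketch illustrates Definitions 4 and 5 and the observation above; the function names, the tolerance parameter, and the example game with its hand-picked values are assumptions made for illustration only.

```python
from typing import Dict, Hashable, Set

State = Hashable


def value_class(val: Dict[State, float], r: float, tol: float = 1e-9) -> Set[State]:
    """States whose value equals r (up to a numerical tolerance)."""
    return {s for s, v in val.items() if abs(v - r) <= tol}


def boundary_prob_states(
    u: Set[State], prob: Set[State], edges: Dict[State, Set[State]]
) -> Set[State]:
    """Probabilistic states of u that have at least one edge leaving u."""
    return {s for s in u & prob if edges[s] - u}


# Hypothetical example: w and l are absorbing with values 1 and 0.
SP = {"p", "q"}
EDGES = {"a": {"p", "q"}, "b": {"p", "q"}, "p": {"w", "l"},
         "q": {"w", "l"}, "w": {"w"}, "l": {"l"}}
VAL = {"w": 1.0, "l": 0.0, "p": 0.5, "q": 0.25, "a": 0.5, "b": 0.25}

VC_HALF = value_class(VAL, 0.5)
BND_HALF = boundary_prob_states(VC_HALF, SP, EDGES)
BND_ONE = boundary_prob_states(value_class(VAL, 1.0), SP, EDGES)
```

In this example the value class 0.5 is `{"a", "p"}` and its only boundary probabilistic state is `p`, which has one successor of higher value and one of lower value; the value class 1 has no boundary probabilistic state, in line with the observation.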

Reduction of a value class. Given a set $U$ of states such that $U$ is $\delta$-live, let $\mathrm{Bnd}(U)$ be the set of
boundary probabilistic states for $U$. We denote by $G_{\mathrm{Bnd}(U)}$ the subgame $G \upharpoonright U$ where every state in
$\mathrm{Bnd}(U)$ is converted to an absorbing state (a state with a self-loop). Since $U$ is $\delta$-live, $G_{\mathrm{Bnd}(U)}$
is a subgame. Given a value class $\mathrm{VC}(\Phi, r)$, let $\mathrm{Bnd}(\Phi, r)$ be the set of boundary probabilistic
states in $\mathrm{VC}(\Phi, r)$. We denote by $G_{\mathrm{Bnd}(\Phi,r)}$ the subgame where every boundary probabilistic state
in $\mathrm{Bnd}(\Phi, r)$ is converted to an absorbing state. We denote $G_{\Phi,r} = G_{\mathrm{Bnd}(\Phi,r)} \upharpoonright \mathrm{VC}(\Phi, r)$: this is
a subgame since every value class is $\delta$-live by Proposition 1, and $\delta$-closed as all states in $\mathrm{Bnd}(\Phi, r)$
are converted to absorbing states.

Lemma 4 (Almost-sure reduction) Let $G$ be a 2½-player game graph and $\mathcal{F} \subseteq \mathcal{P}(C)$ be a
Muller winning condition. Let $\Phi = \mathrm{Muller}(\mathcal{F})$. For $0 < r < 1$, the following assertions hold.

1. Player 1 wins almost-surely for objective $\Phi \cup \mathrm{Reach}(\mathrm{Bnd}(\Phi, r))$ from all states in $G_{\Phi,r}$; i.e.,
$\langle\langle 1 \rangle\rangle_{almost}(\Phi \cup \mathrm{Reach}(\mathrm{Bnd}(\Phi, r))) = \mathrm{VC}(\Phi, r)$ in the subgame $G_{\Phi,r}$.

2. Player 2 wins almost-surely for objective $\overline{\Phi} \cup \mathrm{Reach}(\mathrm{Bnd}(\Phi, r))$ from all states in $G_{\Phi,r}$; i.e.,
$\langle\langle 2 \rangle\rangle_{almost}(\overline{\Phi} \cup \mathrm{Reach}(\mathrm{Bnd}(\Phi, r))) = \mathrm{VC}(\Phi, r)$ in the subgame $G_{\Phi,r}$.

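Proof. We prove the first part; the second part follows from symmetric arguments. The result
is obtained through an argument by contradiction. Let $0 < r < 1$, and let

$$q = \max\{\, \langle\langle 1 \rangle\rangle_{val}(\Phi)(t) \mid t \in E(s) \setminus \mathrm{VC}(\Phi, r),\ s \in \mathrm{VC}(\Phi, r) \cap S_1 \,\},$$

that is, $q$ is the maximum value of a successor state $t$ of a player-1 state $s \in \mathrm{VC}(\Phi, r)$ such that the
successor state $t$ is not in $\mathrm{VC}(\Phi, r)$. By Proposition 1 we must have $q < r$. Hence if player 1 chooses
to escape the value class $\mathrm{VC}(\Phi, r)$, then player 1 gets to see a state with value at most $q < r$. We
consider the subgame $G_{\Phi,r}$. Let $U = \mathrm{VC}(\Phi, r)$ and $Z = \mathrm{Bnd}(\Phi, r)$. Assume, towards a contradiction, that
there exists a state $s \in U$ such that $s \notin \langle\langle 1 \rangle\rangle_{almost}(\Phi \cup \mathrm{Reach}(Z))$. Then we have $s \in (U \setminus Z)$ and
$\langle\langle 2 \rangle\rangle_{val}(\overline{\Phi} \cap \mathrm{Safe}(U \setminus Z))(s) > 0$. It follows from the results of [2] that for all Muller objectives $\Psi$,
if $\langle\langle 2 \rangle\rangle_{val}(\Psi)(s) > 0$, then for some state $s_1$ we have $\langle\langle 2 \rangle\rangle_{val}(\Psi)(s_1) = 1$. Observe that in $G_{\Phi,r}$
all states in $Z$ are absorbing states, and hence the objective $\overline{\Phi} \cap \mathrm{Safe}(U \setminus Z)$ is equivalent to
the objective $\overline{\Phi} \cap \text{coBüchi}(U \setminus Z)$, which is a Muller objective. It follows that there exists a state
$s_1 \in (U \setminus Z)$ such that $\langle\langle 2 \rangle\rangle_{val}(\overline{\Phi} \cap \mathrm{Safe}(U \setminus Z))(s_1) = 1$. Hence there exists a strategy $\widehat{\pi}$ for player 2
in $G_{\Phi,r}$ such that for all strategies $\sigma$ for player 1 in $G_{\Phi,r}$ we have $\mathrm{Pr}_{s_1}^{\sigma,\widehat{\pi}}(\overline{\Phi} \cap \mathrm{Safe}(U \setminus Z)) = 1$. We
will now construct a strategy $\pi^*$ for player 2 as a combination of the strategy $\widehat{\pi}$ and a strategy in
the original game $G$. By Martin's determinacy result (Theorem 2), for all $\varepsilon > 0$, there exists an
$\varepsilon$-optimal strategy $\pi_{\varepsilon}$ for player 2 in $G$ such that for all $s \in S$ and for all strategies $\sigma$ for player 1
we have

$$\mathrm{Pr}_s^{\sigma,\pi_{\varepsilon}}(\overline{\Phi}) \ge \langle\langle 2 \rangle\rangle_{val}(\overline{\Phi})(s) - \varepsilon.$$

Let $r - q = \alpha > 0$, let $\varepsilon = \frac{\alpha}{2}$, and consider an $\varepsilon$-optimal strategy $\pi_{\varepsilon}$ for player 2 in $G$. The
strategy $\pi^*$ in $G$ is constructed as follows: for a history $w$ that remains in $U$, player 2 follows $\widehat{\pi}$;
and if the history reaches $(S \setminus U)$, then player 2 follows the strategy $\pi_{\varepsilon}$. Formally, for a history
$w = \langle s_1, s_2, \ldots, s_k \rangle$ we have

$$\pi^*(w) = \begin{cases}
\widehat{\pi}(w) & \text{if for all } 1 \le j \le k,\ s_j \in U; \\
\pi_{\varepsilon}(s_j, s_{j+1}, \ldots, s_k) & \text{where } j = \min\{\, i \mid s_i \notin U \,\}.
\end{cases}$$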

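We consider the case when the play starts at $s_1$. The strategy $\pi^*$ ensures the following: if the game
stays in $U$, then the strategy $\widehat{\pi}$ is followed, and given that the play stays in $U$, the strategy $\widehat{\pi}$ ensures
with probability 1 that $\overline{\Phi}$ is satisfied and $\mathrm{Bnd}(\Phi, r)$ is not reached. Hence if the game escapes $U$
(i.e., player 1 chooses to escape $U$), then it reaches a state with value at most $q$ for player 1. We
consider an arbitrary strategy $\sigma$ for player 1 and consider the following cases.

1. If $\mathrm{Pr}_{s_1}^{\sigma,\pi^*}(\mathrm{Safe}(U)) = 1$, then we have $\mathrm{Pr}_{s_1}^{\sigma,\pi^*}(\overline{\Phi} \cap \mathrm{Safe}(U)) = \mathrm{Pr}_{s_1}^{\sigma,\widehat{\pi}}(\overline{\Phi} \cap \mathrm{Safe}(U)) = 1$. Hence
we also have $\mathrm{Pr}_{s_1}^{\sigma,\pi^*}(\overline{\Phi}) = 1$, i.e., we have $\mathrm{Pr}_{s_1}^{\sigma,\pi^*}(\Phi) = 0$.

2. If $\mathrm{Pr}_{s_1}^{\sigma,\pi^*}(\mathrm{Reach}(S \setminus U)) = 1$, then the play reaches a state with value for player 1 at most $q$
and the strategy $\pi_{\varepsilon}$ ensures that $\mathrm{Pr}_{s_1}^{\sigma,\pi^*}(\Phi) \le q + \varepsilon$.

3. If $\mathrm{Pr}_{s_1}^{\sigma,\pi^*}(\mathrm{Safe}(U)) > 0$ and $\mathrm{Pr}_{s_1}^{\sigma,\pi^*}(\mathrm{Reach}(S \setminus U)) > 0$, then we condition on both these events
and have the following:

$$\begin{aligned}
\mathrm{Pr}_{s_1}^{\sigma,\pi^*}(\Phi) &= \mathrm{Pr}_{s_1}^{\sigma,\pi^*}(\Phi \mid \mathrm{Safe}(U)) \cdot \mathrm{Pr}_{s_1}^{\sigma,\pi^*}(\mathrm{Safe}(U)) \\
&\quad + \mathrm{Pr}_{s_1}^{\sigma,\pi^*}(\Phi \mid \mathrm{Reach}(S \setminus U)) \cdot \mathrm{Pr}_{s_1}^{\sigma,\pi^*}(\mathrm{Reach}(S \setminus U)) \\
&\le 0 + (q + \varepsilon) \cdot \mathrm{Pr}_{s_1}^{\sigma,\pi^*}(\mathrm{Reach}(S \setminus U)) \\
&\le q + \varepsilon.
\end{aligned}$$

The above inequalities are obtained as follows: given the event $\mathrm{Safe}(U)$, the strategy $\pi^*$ follows
$\widehat{\pi}$ and ensures that $\overline{\Phi}$ is satisfied with probability 1 (i.e., $\Phi$ is satisfied with probability 0);
else the game reaches states where the value for player 1 is at most $q$, and then the analysis
is similar to the previous case.

Hence for all strategies $\sigma$ we have

$$\mathrm{Pr}_{s_1}^{\sigma,\pi^*}(\Phi) \le q + \varepsilon = q + \frac{\alpha}{2} = r - \frac{\alpha}{2}.$$

Hence we must have $\langle\langle 1 \rangle\rangle_{val}(\Phi)(s_1) \le r - \frac{\alpha}{2}$. Since $\alpha > 0$ and $s_1 \in \mathrm{VC}(\Phi, r)$ (i.e., $\langle\langle 1 \rangle\rangle_{val}(\Phi)(s_1) = r$),
we have a contradiction. The desired result follows. ∎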


Lemma 5 (Almost-sure to optimality [4]) Let $G$ be a 2½-player game graph and $\mathcal{F} \subseteq \mathcal{P}(C)$
be a Muller winning condition. Let $\Phi = \mathrm{Muller}(\mathcal{F})$. Let $\sigma$ be a strategy such that

• $\sigma$ is an almost-sure winning strategy from the almost-sure winning states ($\langle\langle 1 \rangle\rangle_{almost}(\Phi)$ in
$G$); and

• $\sigma$ is an almost-sure winning strategy for objective $\Phi \cup \mathrm{Reach}(\mathrm{Bnd}(\Phi, r))$ in the game $G_{\Phi,r}$, for
all $0 < r < 1$.

Then $\sigma$ is an optimal strategy.

Proof. We prove the result for the case when $\sigma$ is memoryless (randomized memoryless). In the case
when $\sigma$ is finite-memory with memory $M$, the arguments can be repeated on the game $G \times M$ (the
usual synchronous product of $G$ and the memory $M$).

Consider the player-2 MDP $G_{\sigma}$ with the objective $\mathrm{Muller}(\overline{\mathcal{F}})$ for player 2. In MDPs with Muller
objectives randomized memoryless optimal strategies exist [3]. We fix a randomized memoryless
optimal strategy $\pi$ for player 2 in $G_{\sigma}$. Let $W_1 = \langle\langle 1 \rangle\rangle_{almost}(\Phi)$ and $W_2 = \langle\langle 2 \rangle\rangle_{almost}(\overline{\Phi})$. We consider
the Markov chain $G_{\sigma,\pi}$ and analyze the recurrent states of the Markov chain.

Recurrent states in $G_{\sigma,\pi}$. Let $U$ be a closed, connected recurrent set in $G_{\sigma,\pi}$ (i.e., $U$ is a bottom
strongly connected component in the graph of $G_{\sigma,\pi}$). Let $q = \max\{\, r \mid \mathrm{VC}(\Phi, r) \cap U \neq \emptyset \,\}$, i.e.,
for all $q' > q$ we have $\mathrm{VC}(\Phi, q') \cap U = \emptyset$, or in other words $\mathrm{VC}(\Phi, >q) \cap U = \emptyset$. For a state
$s \in U \cap \mathrm{VC}(\Phi, q)$ we have the following cases.

1. If $s \in S_1$, then $\mathrm{Supp}(\sigma(s)) \subseteq \mathrm{VC}(\Phi, q)$. This is because in the game $G_{\Phi,q}$ the edges of player 1
consist of edges in the value class $\mathrm{VC}(\Phi, q)$.

2. If $s \in S_{\bigcirc}$ and $s \in \mathrm{Bnd}(\Phi, q)$, then it means that $U \cap \mathrm{VC}(\Phi, q') \neq \emptyset$ for some $q' > q$: this is
because $E(s) \cap \mathrm{VC}(\Phi, >q) \neq \emptyset$ for $s \in \mathrm{Bnd}(\Phi, q)$ and $U$ is closed. This is not possible since
by assumption on $U$ we have $U \cap \mathrm{VC}(\Phi, >q) = \emptyset$. Hence we have $s \in S_{\bigcirc} \cap (U \setminus \mathrm{Bnd}(\Phi, q))$,
and $E(s) \subseteq \mathrm{VC}(\Phi, q)$.

3. If $s \in S_2$, then since $U \cap \mathrm{VC}(\Phi, >q) = \emptyset$, it follows by Proposition 1 that
$\mathrm{Supp}(\pi(s)) \subseteq \mathrm{VC}(\Phi, q)$.

Hence for all $s \in U \cap \mathrm{VC}(\Phi, q)$ all successors of $s$ in $G_{\sigma,\pi}$ are in $\mathrm{VC}(\Phi, q)$, and moreover
$U \cap \mathrm{Bnd}(\Phi, q) = \emptyset$, i.e., $U$ is contained in a value class and does not intersect the boundary
probabilistic states. By the property of strategy $\sigma$, if $U \cap (S \setminus W_2) \neq \emptyset$, then for all $s \in U$ we
have $\mathrm{Pr}_s^{\sigma,\pi}(\Phi) = 1$: this is because for all $r > 0$, the strategy $\sigma$ is almost-sure winning for objective
$\Phi \cup \mathrm{Reach}(\mathrm{Bnd}(\Phi, r))$ in $G_{\Phi,r}$. Since $\sigma$ is a fixed strategy and $\pi$ is optimal against $\sigma$, it follows that
if $\langle\langle 1 \rangle\rangle_{val}(\Phi)(s) < 1$, then $\mathrm{Pr}_s^{\sigma,\pi}(\Phi) < 1$. Hence it follows that $U \cap (S \setminus (W_1 \cup W_2)) = \emptyset$. Hence
the recurrent states of $G_{\sigma,\pi}$ are contained in $W_1 \cup W_2$, i.e., we have $\mathrm{Pr}_s^{\sigma,\pi}(\mathrm{Reach}(W_1 \cup W_2)) = 1$.
Since $\sigma$ is an almost-sure winning strategy in $W_1$, we have $\mathrm{Pr}_s^{\sigma,\pi}(\overline{\Phi}) = \mathrm{Pr}_s^{\sigma,\pi}(\mathrm{Reach}(W_2))$. Hence
the strategy $\pi$ maximizes the probability to reach $W_2$ in the MDP $G_{\sigma}$.

Analyzing  reachability  in  Ga.  Since  in  Ga  player  2  maximizes  the  probability  to  reachability  to  W2, 
we  analyze  the  player-2  MDP  Ga  with  objective  Reach(H/2)  for  player  2.  For  every  state  s  consider 
a  real-valued  variable  xs  =  1  —  ((l))„a;(<I>)(s)  =  ((2))uo/($)(s).  The  following  constraints  are  satisfied 

^  =  Etesupp^*))  **  VS1; 

^  =  EiGE(s)  xt  ■  ^(s)(0  s  e  SQ; 

xs  >  xt  s  G  52; 

x.s  ~  1  s  €  W2; 

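The first equality follows as for all $r \in [0, 1]$ and for all $s \in S_1 \cap \mathrm{VC}(\Phi, r)$ we have $\mathrm{Supp}(\sigma(s)) \subseteq
\mathrm{VC}(\Phi, r)$. The next equality and the first inequality follow from Proposition 1. Since the values
for MDPs with reachability objectives are characterized as the least value vector satisfying the above
constraints [13], it follows that for all $s \in S$ and for all strategies $\pi_1 \in \Pi$ we have

$\Pr_s^{\sigma,\pi_1}(\mathrm{Reach}(W_2)) \leq x_s = \langle\!\langle 2 \rangle\!\rangle_{\mathit{val}}(\overline{\Phi})(s)$.

Hence we have $\Pr_s^{\sigma,\pi}(\overline{\Phi}) \leq \langle\!\langle 2 \rangle\!\rangle_{\mathit{val}}(\overline{\Phi})(s)$, i.e., $\Pr_s^{\sigma,\pi}(\Phi) \geq 1 - \langle\!\langle 2 \rangle\!\rangle_{\mathit{val}}(\overline{\Phi})(s) = \langle\!\langle 1 \rangle\!\rangle_{\mathit{val}}(\Phi)(s)$. Thus
we obtain that $\sigma$ is an optimal strategy. ∎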
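
The least-value-vector characterization used in this argument can be illustrated by value iteration from below. The following Python fragment is only a sketch under assumptions of our own: the dictionary encoding (`owner`, `succ`, `delta`) and the name `max_reach_values` are hypothetical and do not come from the paper. Starting from 0 outside the target, the iterates approach the least solution of the constraints.

```python
def max_reach_values(states, owner, succ, delta, target, iters=1000):
    """Approximate the maximal probability of reaching `target` in an MDP.

    owner[s] is "max" (maximizing player) or "prob" (probabilistic state);
    succ[s] lists the successors of s; delta[s][t] is a transition probability.
    Iterating from the zero vector approaches the least solution from below.
    """
    x = {s: (1.0 if s in target else 0.0) for s in states}
    for _ in range(iters):
        new = {}
        for s in states:
            if s in target:
                new[s] = 1.0
            elif owner[s] == "max":
                new[s] = max(x[t] for t in succ[s])
            else:
                new[s] = sum(delta[s][t] * x[t] for t in succ[s])
        x = new
    return x

# Toy MDP (hypothetical): at "a" the player picks "b" or "c"; "b" is a fair
# coin flip between "goal" and "sink"; "c" can only loop on itself.
states = ["a", "b", "c", "goal", "sink"]
owner = {"a": "max", "b": "prob", "c": "max", "goal": "max", "sink": "max"}
succ = {"a": ["b", "c"], "b": ["goal", "sink"], "c": ["c"],
        "goal": ["goal"], "sink": ["sink"]}
delta = {"b": {"goal": 0.5, "sink": 0.5}}
values = max_reach_values(states, owner, succ, delta, {"goal"})
```

The looping state `c` shows why the least solution matters: setting $x_c = 1$ also satisfies $x_c \geq x_c$, but the true reachability probability from `c` is 0.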

Muller reduction for $G_{\Phi,r}$. Given a Muller winning condition $\mathcal{F}$ and the objective $\Phi =
\mathrm{Muller}(\mathcal{F})$, we consider the game $G_{\Phi,r}$ with the objective $\Phi \cup \mathrm{Reach}(\mathrm{Bnd}(\Phi, r))$ for player 1. We
present a simple reduction to a game with objective $\Phi$. The reduction is achieved as follows: without
loss of generality we assume $\mathcal{F} \neq \emptyset$, and let $F \in \mathcal{F}$ with $F = \{ c_1^F, c_2^F, \ldots, c_k^F \}$. We construct a
game graph $\widetilde{G}_{\Phi,r}$ with objective $\Phi$ for player 1 as follows: convert every state $s_j \in \mathrm{Bnd}(\Phi, r)$ to a
cycle $U_j = \{ s_1^j, s_2^j, \ldots, s_k^j \}$ with $\chi(s_i^j) = c_i^F$, i.e., once $s_j$ is reached the cycle $U_j$ is repeated with
$\chi(U_j) \in \mathcal{F}$. An almost-sure winning strategy in $G_{\Phi,r}$ with objective $\Phi \cup \mathrm{Reach}(\mathrm{Bnd}(\Phi, r))$ is an
almost-sure winning strategy in $\widetilde{G}_{\Phi,r}$ with objective $\Phi$, and vice versa. The present reduction along
with Lemma 4 and Lemma 5 gives us Lemma 6. Observe that Lemma 4 ensures that strategies
satisfying the conditions of Lemma 5 exist. Lemma 6 along with Theorem 3 gives us Theorem 6.

Lemma 6 For all Muller winning conditions $\mathcal{F}$, the following assertions hold.

1. If the family of pure finite-memory strategies of size $\ell_{\mathcal{F}}$ suffices for almost-sure winning on
2½-player game graphs, then the family of pure finite-memory strategies of size $\ell_{\mathcal{F}}$ suffices
for optimality on 2½-player game graphs.

2. If the family of randomized finite-memory strategies of size $\ell_{\mathcal{F}}$ suffices for almost-sure winning
on 2½-player game graphs, then the family of randomized finite-memory strategies of size $\ell_{\mathcal{F}}$
suffices for optimality on 2½-player game graphs.

Theorem 6 For all Muller winning conditions $\mathcal{F}$, the family of pure finite-memory strategies of
size $m_{\mathcal{F}}$ suffices for optimality on 2½-player game graphs.

4.1  Complexity  of  quantitative  analysis 

In this section we consider the complexity of quantitative analysis of 2½-player games with Muller
objectives. We first prove some properties of the values of 2½-player games with Muller objectives.
We start with a lemma.

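Lemma 7 For all 2½-player game graphs, for all Muller objectives $\Phi$ there exist optimal strategies
$\sigma$ and $\pi$ for player 1 and player 2 such that the following assertions hold:

1. for all $r \in (0, 1)$, for all $s \in \mathrm{VC}(\Phi, r)$ we have $\Pr_s^{\sigma,\pi}(\mathrm{Reach}(\mathrm{Bnd}(\Phi, r))) = 1$;

2. for all $s \in S$ we have

$\Pr_s^{\sigma,\pi}(\mathrm{Reach}(W_1 \cup W_2)) = 1$;

$\Pr_s^{\sigma,\pi}(\mathrm{Reach}(W_1)) = \langle\!\langle 1 \rangle\!\rangle_{\mathit{val}}(\Phi)(s)$; $\Pr_s^{\sigma,\pi}(\mathrm{Reach}(W_2)) = \langle\!\langle 2 \rangle\!\rangle_{\mathit{val}}(\overline{\Phi})(s)$;

where $W_1 = \langle\!\langle 1 \rangle\!\rangle_{\mathit{almost}}(\Phi)$ and $W_2 = \langle\!\langle 2 \rangle\!\rangle_{\mathit{almost}}(\overline{\Phi})$.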

Proof. Consider an optimal strategy $\sigma$ that satisfies the conditions of Lemma 5, and a strategy
$\pi$ that satisfies analogous conditions for player 2. For all $r \in (0, 1)$, the strategy $\sigma$ is almost-sure
winning for the objective $\Phi \cup \mathrm{Reach}(\mathrm{Bnd}(\Phi, r))$ and the strategy $\pi$ is almost-sure winning for the
objective $\overline{\Phi} \cup \mathrm{Reach}(\mathrm{Bnd}(\Phi, r))$, in the game $G_{\Phi,r}$. Thus we obtain that for all $r \in (0, 1)$, for all
$s \in \mathrm{VC}(\Phi, r)$ we have

$\Pr_s^{\sigma,\pi}(\Phi \cup \mathrm{Reach}(\mathrm{Bnd}(\Phi, r))) = 1$; and $\Pr_s^{\sigma,\pi}(\overline{\Phi} \cup \mathrm{Reach}(\mathrm{Bnd}(\Phi, r))) = 1$.

Since $\Phi$ and $\overline{\Phi}$ are disjoint, it follows that for all $r \in (0, 1)$, for all $s \in \mathrm{VC}(\Phi, r)$ we have

$\Pr_s^{\sigma,\pi}(\mathrm{Reach}(\mathrm{Bnd}(\Phi, r))) = 1$.

From the above condition it easily follows that for all $s \in S$ we have $\Pr_s^{\sigma,\pi}(\mathrm{Reach}(W_1 \cup W_2)) = 1$.
Since $\sigma$ and $\pi$ are optimal strategies, all the requirements of the second condition are fulfilled.
Hence, the strategies $\sigma$ and $\pi$ are witness strategies that prove the desired result. ∎

Characterizing values for 2½-player Muller games. We now relate the values of 2½-player
game graphs with Muller objectives to the values of a Markov chain, on the same state space,
with reachability objectives. Once the relationship is established we obtain a bound on the precision
of the values. We use Lemma 7 to present two transformations to Markov chains.

Markov chain transformation. Given a 2½-player game graph $G = ((S, E), (S_1, S_2, S_{\bigcirc}), \delta)$
with a Muller objective $\Phi$, let $W_1 = \langle\!\langle 1 \rangle\!\rangle_{\mathit{almost}}(\Phi)$ and $W_2 = \langle\!\langle 2 \rangle\!\rangle_{\mathit{almost}}(\overline{\Phi})$ be the sets of almost-sure
winning states for the players. Let $\sigma$ and $\pi$ be optimal strategies for the players (obtained from
Lemma 7) such that

1. for all $r \in (0, 1)$, for all $s \in \mathrm{VC}(\Phi, r)$ we have $\Pr_s^{\sigma,\pi}(\mathrm{Reach}(\mathrm{Bnd}(\Phi, r))) = 1$;

2. for all $s \in S$ we have

$\Pr_s^{\sigma,\pi}(\mathrm{Reach}(W_1 \cup W_2)) = 1$;

$\Pr_s^{\sigma,\pi}(\mathrm{Reach}(W_1)) = \langle\!\langle 1 \rangle\!\rangle_{\mathit{val}}(\Phi)(s)$; $\Pr_s^{\sigma,\pi}(\mathrm{Reach}(W_2)) = \langle\!\langle 2 \rangle\!\rangle_{\mathit{val}}(\overline{\Phi})(s)$.

We first consider a Markov chain that mimics the stochastic process under $\sigma$ and $\pi$. The Markov
chain $\overline{G} = (S, \overline{\delta}) = \mathrm{MC}_1(G, \Phi)$ with the transition function $\overline{\delta}$ is defined as follows:

1. for $s \in W_1 \cup W_2$ we have $\overline{\delta}(s)(s) = 1$;

2. for $r \in (0, 1)$ and $s \in \mathrm{VC}(\Phi, r) \setminus \mathrm{Bnd}(\Phi, r)$ we have $\overline{\delta}(s)(t) = \Pr_s^{\sigma,\pi}(\mathrm{Reach}(\{ t \}))$, for $t \in
\mathrm{Bnd}(\Phi, r)$ (since for all $s \in \mathrm{VC}(\Phi, r)$ we have $\Pr_s^{\sigma,\pi}(\mathrm{Reach}(\mathrm{Bnd}(\Phi, r))) = 1$, the transition
function $\overline{\delta}$ at $s$ is a probability distribution); and

3. for $r \in (0, 1)$ and $s \in \mathrm{Bnd}(\Phi, r)$ we have $\overline{\delta}(s)(t) = \delta(s)(t)$, for $t \in S$.

The Markov chain $\overline{G}$ mimics the stochastic process under $\sigma$ and $\pi$ and yields the following lemma.

Lemma 8 For all 2½-player game graphs $G$ and all Muller objectives $\Phi$, consider the Markov
chain $\overline{G} = \mathrm{MC}_1(G, \Phi)$. Then for all $s \in S$ we have $\langle\!\langle 1 \rangle\!\rangle_{\mathit{val}}(\Phi)(s) = \Pr_s(\mathrm{Reach}(W_1))$, that is, the
value for $\Phi$ in $G$ is equal to the probability to reach $W_1$ in the Markov chain $\overline{G}$.
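
Once a Markov chain of this shape is available, the probability to reach $W_1$ is the solution of a linear system and can be computed exactly over the rationals. The sketch below is an illustration only: the encoding of the chain as nested dictionaries, the function name `reach_probabilities`, and the toy chain are assumptions made for the example, not objects from the paper.

```python
from fractions import Fraction

def reach_probabilities(chain, target):
    """Exact probability of reaching `target` from every state of a finite chain.

    chain[s] maps each successor t of s to the probability of moving from s to t.
    """
    # States that can reach the target at all; every other state has value 0.
    can = set(target)
    changed = True
    while changed:
        changed = False
        for s, row in chain.items():
            if s not in can and any(p > 0 and t in can for t, p in row.items()):
                can.add(s)
                changed = True
    unknown = [s for s in chain if s in can and s not in target]
    idx = {s: i for i, s in enumerate(unknown)}
    n = len(unknown)
    # Augmented matrix of the system (I - P) x = b restricted to `unknown`.
    a = [[Fraction(0)] * (n + 1) for _ in range(n)]
    for s in unknown:
        i = idx[s]
        a[i][i] += 1
        for t, p in chain[s].items():
            if t in target:
                a[i][n] += p
            elif t in idx:
                a[i][idx[t]] -= p
    # Gauss-Jordan elimination with exact rational arithmetic.
    for c in range(n):
        piv = next(r for r in range(c, n) if a[r][c] != 0)
        a[c], a[piv] = a[piv], a[c]
        a[c] = [v / a[c][c] for v in a[c]]
        for r in range(n):
            if r != c and a[r][c] != 0:
                f = a[r][c]
                a[r] = [vr - f * vc for vr, vc in zip(a[r], a[c])]
    x = {s: Fraction(1 if s in target else 0) for s in chain}
    for s in unknown:
        x[s] = a[idx[s]][n]
    return x

# Toy chain (hypothetical): "w1" and "w2" are absorbing, "b" plays the role of a
# boundary state, "s" moves to "b" with probability 1, and "c" has a self-loop.
chain = {
    "w1": {"w1": Fraction(1)},
    "w2": {"w2": Fraction(1)},
    "b": {"w1": Fraction(1, 3), "w2": Fraction(2, 3)},
    "s": {"b": Fraction(1)},
    "c": {"c": Fraction(1, 2), "w1": Fraction(1, 4), "b": Fraction(1, 4)},
}
probs = reach_probabilities(chain, {"w1"})
```

In this toy chain the interior state `s` inherits the value of the boundary state `b` it is routed to, which is the pattern that the transformation relies on.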

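Second transformation. We now transform the Markov chain $\overline{G}$ to another Markov chain $\widetilde{G}$. We
start with the observation that for $r \in (0, 1)$, for all states $s, t \in \mathrm{Bnd}(\Phi, r)$ in the Markov chain $\overline{G}$ we
have $\Pr_s(\mathrm{Reach}(W_1)) = \Pr_t(\mathrm{Reach}(W_1)) = r$. Moreover, for $r \in (0, 1)$, every state $s \in \mathrm{Bnd}(\Phi, r)$
has edges to higher and lower value classes. Hence for a state $s \in \mathrm{VC}(\Phi, r) \setminus \mathrm{Bnd}(\Phi, r)$ if we
choose a state $t_r \in \mathrm{Bnd}(\Phi, r)$ and set the transition probability from $s$ to $t_r$ to 1, the probability
to reach $W_1$ does not change. This motivates the following transformation: given a 2½-player
game graph $G = ((S, E), (S_1, S_2, S_{\bigcirc}), \delta)$ with a Muller objective $\Phi$, let $W_1 = \langle\!\langle 1 \rangle\!\rangle_{\mathit{almost}}(\Phi)$ and
$W_2 = \langle\!\langle 2 \rangle\!\rangle_{\mathit{almost}}(\overline{\Phi})$ be the sets of almost-sure winning states for the players. Let $\sigma$ and $\pi$ be optimal
strategies for the players (obtained from Lemma 7) such that

1. for all $r \in (0, 1)$, for all $s \in \mathrm{VC}(\Phi, r)$ we have $\Pr_s^{\sigma,\pi}(\mathrm{Reach}(\mathrm{Bnd}(\Phi, r))) = 1$;

2. for all $s \in S$ we have

$\Pr_s^{\sigma,\pi}(\mathrm{Reach}(W_1 \cup W_2)) = 1$;

$\Pr_s^{\sigma,\pi}(\mathrm{Reach}(W_1)) = \langle\!\langle 1 \rangle\!\rangle_{\mathit{val}}(\Phi)(s)$; $\Pr_s^{\sigma,\pi}(\mathrm{Reach}(W_2)) = \langle\!\langle 2 \rangle\!\rangle_{\mathit{val}}(\overline{\Phi})(s)$.

The Markov chain $\widetilde{G} = (S, \widetilde{\delta}) = \mathrm{MC}_2(G, \Phi)$ with the transition function $\widetilde{\delta}$ is defined as follows: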

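1. for $s \in W_1 \cup W_2$ we have $\widetilde{\delta}(s)(s) = 1$;

2. for $r \in (0, 1)$ and $s \in \mathrm{VC}(\Phi, r) \setminus \mathrm{Bnd}(\Phi, r)$, pick $t \in \mathrm{Bnd}(\Phi, r)$ and set $\widetilde{\delta}(s)(t) = 1$; and

3. for $r \in (0, 1)$ and $s \in \mathrm{Bnd}(\Phi, r)$ we have $\widetilde{\delta}(s)(t) = \delta(s)(t)$, for $t \in S$.

Observe that for $\delta_{>0} = \{ \delta(s)(t) \mid s \in S_{\bigcirc}, t \in S, \delta(s)(t) > 0 \}$ and $\widetilde{\delta}_{>0} = \{ \widetilde{\delta}(s)(t) \mid s \in S, t \in
S, \widetilde{\delta}(s)(t) > 0 \}$, we have $\widetilde{\delta}_{>0} \subseteq \delta_{>0} \cup \{ 1 \}$, i.e., the transition probabilities in $\widetilde{G}$ are a subset of the
transition probabilities in $G$ (together with 1). Let

$\delta_u = \max\{ q \mid \delta(s)(t) = \frac{p}{q} \text{ for } s \in S_{\bigcirc} \text{ and } \delta(s)(t) > 0 \}$;

$\widetilde{\delta}_u = \max\{ q \mid \widetilde{\delta}(s)(t) = \frac{p}{q} \text{ for } s \in S_{\bigcirc} \text{ and } \widetilde{\delta}(s)(t) > 0 \}$.

Since $\widetilde{\delta}_{>0} \subseteq \delta_{>0} \cup \{ 1 \}$, it follows that $\widetilde{\delta}_u \leq \delta_u$. The following lemma is immediate from Lemma 8
and the equivalence of the probabilities to reach $W_1$ in $\overline{G}$ and $\widetilde{G}$.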

Lemma 9 For all 2½-player game graphs $G$ and all Muller objectives $\Phi$, consider the Markov
chain $\widetilde{G} = \mathrm{MC}_2(G, \Phi)$. Then for all $s \in S$ we have $\langle\!\langle 1 \rangle\!\rangle_{\mathit{val}}(\Phi)(s) = \Pr_s(\mathrm{Reach}(W_1))$, that is, the
value for $\Phi$ in $G$ is equal to the probability to reach $W_1$ in the Markov chain $\widetilde{G}$.

Lemma  10  is  a  result  from  [7]  (Lemma  2  of  [7]). 

Lemma 10 ([7]) Let $G = ((S, E), (S_1, S_2, S_{\bigcirc}), \delta)$ be a 2½-player game graph with $n$ states such
that every state has at most two successors and for all $s \in S_{\bigcirc}$ and $t \in E(s)$ we have $\delta(s)(t) = \frac{1}{2}$.
Then for all $R \subseteq S$, for all $s \in S$ we have

$\langle\!\langle 1 \rangle\!\rangle_{\mathit{val}}(\mathrm{Reach}(R))(s) = \frac{p}{q}$ where $p, q$ are integers with $p, q \leq 4^{n-1}$.

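The results of [27] showed that a 2½-player game graph $G = ((S, E), (S_1, S_2, S_{\bigcirc}), \delta)$ can be
reduced to an equivalent 2½-player game graph $\widehat{G} = ((\widehat{S}, \widehat{E}), (\widehat{S}_1, \widehat{S}_2, \widehat{S}_{\bigcirc}), \widehat{\delta})$ such that every state
$s \in \widehat{S}$ has at most two successors, for all $s \in \widehat{S}_{\bigcirc}$ and $t \in \widehat{E}(s)$ we have $\widehat{\delta}(s)(t) = \frac{1}{2}$, and
$|\widehat{S}| = 2 \cdot |E| \cdot \log \delta_u$. Lemma 11 follows from this reduction and Lemma 10.

Lemma 11 ([27]) Let $G = ((S, E), (S_1, S_2, S_{\bigcirc}), \delta)$ be a 2½-player game graph. Then for all $R \subseteq S$,
for all $s \in S$ we have

$\langle\!\langle 1 \rangle\!\rangle_{\mathit{val}}(\mathrm{Reach}(R))(s) = \frac{p}{q}$ where $p, q$ are integers with $p, q \leq 4^{2 \cdot |E| \cdot \log \delta_u} = \delta_u^{4 \cdot |E|}$.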
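
The closed form of the bound in Lemma 11 is a one-line computation. It is spelled out here under two assumptions that this section does not state explicitly: the logarithm is taken to base 2, and $n = 2 \cdot |E| \cdot \log_2 \delta_u$ is the number of states of the reduced game to which Lemma 10 is applied.

```latex
\[
  4^{\,n-1} \;\le\; 4^{\,2 \cdot |E| \cdot \log_2 \delta_u}
  \;=\; 2^{\,4 \cdot |E| \cdot \log_2 \delta_u}
  \;=\; \bigl( 2^{\log_2 \delta_u} \bigr)^{4 \cdot |E|}
  \;=\; \delta_u^{\,4 \cdot |E|} .
\]
```

In particular, a numerator or denominator below this bound can be written with $4 \cdot |E| \cdot \log_2 \delta_u$ bits, which is polynomial in the size of a game whose probabilities are given in binary.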

Lemma 12 For all 2½-player game graphs $G = ((S, E), (S_1, S_2, S_{\bigcirc}), \delta)$ and all Muller objectives
$\Phi$, for all states $s \in S \setminus (W_1 \cup W_2)$ we have

$\langle\!\langle 1 \rangle\!\rangle_{\mathit{val}}(\Phi)(s) = \frac{p}{q}$ where $p, q$ are integers with $0 < p < q \leq \delta_u^{4 \cdot |E|}$,

where $W_1$ and $W_2$ are the almost-sure winning states for player 1 and player 2, respectively.

Proof. Lemma 9 shows that the values of the game $G$ can be related to the values of reaching a set of
states in a Markov chain $\widetilde{G}$ defined on the same state space, and also we have $\widetilde{\delta}_u \leq \delta_u$. The
bound on $p$ and $q$ then follows from Lemma 11 and the fact that Markov chains are a subclass of
2½-player games. ∎

Lemma 13 Let $G = ((S, E), (S_1, S_2, S_{\bigcirc}), \delta)$ be a 2½-player game with a Muller objective $\Phi$. Let
$\mathcal{V} = (V_0, V_1, V_2, \ldots, V_k)$ be a partition of the state space $S$, and let $r_0 > r_1 > r_2 > \cdots > r_k$ be
rational values such that the following conditions hold:

1. $V_0 = \langle\!\langle 1 \rangle\!\rangle_{\mathit{almost}}(\Phi)$ and $V_k = \langle\!\langle 2 \rangle\!\rangle_{\mathit{almost}}(\overline{\Phi})$;

2. $r_0 = 1$ and $r_k = 0$;

3. for all $1 \leq i \leq k - 1$ we have $\mathrm{Bnd}(V_i) \neq \emptyset$ and $V_i$ is $\delta$-live;

4. for all $1 \leq i \leq k - 1$ and all $s \in S_2 \cap V_i$ we have $E(s) \subseteq \bigcup_{j \leq i} V_j$;

5. for all $1 \leq i \leq k - 1$ we have $V_i = \langle\!\langle 1 \rangle\!\rangle_{\mathit{almost}}(\Phi \cup \mathrm{Reach}(\mathrm{Bnd}(V_i)))$ in $G_{\mathrm{Bnd}(V_i)}$;

6. let $x_s = r_i$, for $s \in V_i$, and for all $s \in S_{\bigcirc}$, let $x_s$ satisfy $x_s = \sum_{t \in E(s)} x_t \cdot \delta(s)(t)$.

Then we have $\langle\!\langle 1 \rangle\!\rangle_{\mathit{val}}(\Phi)(s) \geq x_s$ for all $s \in S$.

Proof. Let $\sigma$ be a finite-memory strategy with memory $M$ such that (a) $\sigma$ is almost-sure winning
from $V_0$; and (b) for all $1 \leq i \leq k - 1$ and $s \in V_i$ and all strategies $\pi$ for player 2 in $G_{\mathrm{Bnd}(V_i)}$ we have
$\Pr_s^{\sigma,\pi}(\Phi \cup \mathrm{Reach}(\mathrm{Bnd}(V_i))) = 1$; such a strategy exists since condition 1 ($V_0 = \langle\!\langle 1 \rangle\!\rangle_{\mathit{almost}}(\Phi)$) and
condition 5 are satisfied. Let $\pi$ be a finite-memory counter-optimal strategy for player 2 in $G_\sigma$, i.e.,
$\pi$ is optimal for player 2 for objective $\overline{\Phi}$ in $G_\sigma$. We claim that for all $1 \leq i \leq k - 1$ and for all $s \in V_i$
we have $\Pr_s^{\sigma,\pi}(\mathrm{Reach}(\mathrm{Bnd}(V_i) \cup \bigcup_{j<i} V_j)) = 1$. To prove the claim, assume towards contradiction
that for some $1 \leq i \leq k - 1$ and $s \in V_i$ we have $\Pr_s^{\sigma,\pi}(\mathrm{Reach}(\mathrm{Bnd}(V_i) \cup \bigcup_{j<i} V_j)) < 1$. Then
since condition 4 holds we would have $\Pr_s^{\sigma,\pi}(\mathrm{Safe}(V_i \setminus \mathrm{Bnd}(V_i))) > 0$. If $\Pr_s^{\sigma,\pi}(\mathrm{Safe}(V_i \setminus \mathrm{Bnd}(V_i))) >
0$, then there must be a closed connected recurrent set $C$ in $G_{\sigma,\pi}$ such that $C$ is contained in
$(V_i \setminus \mathrm{Bnd}(V_i)) \times M$. Hence for states $s \in C$ we would have $\Pr_s^{\sigma,\pi}(\Phi) = 1$; this holds since we
have $\Pr_s^{\sigma,\pi}(\Phi \cup \mathrm{Reach}(\mathrm{Bnd}(V_i))) = 1$. This contradicts the facts that $\pi$ is counter-optimal and
$V_i \cap \langle\!\langle 1 \rangle\!\rangle_{\mathit{almost}}(\Phi) = \emptyset$. Thus we obtain that for all $1 \leq i \leq k - 1$ and all $s \in V_i$ we have
$\Pr_s^{\sigma,\pi}(\mathrm{Reach}(\mathrm{Bnd}(V_i) \cup \bigcup_{j<i} V_j)) = 1$. It follows that for all $s \in S$ we have $\Pr_s^{\sigma,\pi}(\mathrm{Reach}(V_0 \cup V_k)) = 1$.
By the ordering $r_0 > r_1 > r_2 > \cdots > r_k$, condition 4, and condition 6, it follows that for all $s \in S$
we have $\Pr_s^{\sigma,\pi}(\mathrm{Reach}(V_k)) \leq 1 - x_s$; this follows by the analysis of the MDP $G_\sigma$ with the reachability
objective $\mathrm{Reach}(V_k)$ for player 2. Hence we have $\Pr_s^{\sigma,\pi}(\mathrm{Reach}(V_0)) \geq x_s$. Since $\sigma$ is almost-sure
winning from $V_0$, we obtain that for all $s \in S$ we have $\langle\!\langle 1 \rangle\!\rangle_{\mathit{val}}(\Phi)(s) \geq x_s$. The desired result
follows. ∎

A PSPACE algorithm for quantitative analysis. We now present a PSPACE algorithm for
quantitative analysis for 2½-player games with Muller objectives $\mathrm{Muller}(\mathcal{F})$. A PSPACE lower
bound is already known for the qualitative analysis of 2-player games with Muller objectives [15].
To obtain an upper bound we present an NPSPACE algorithm. The algorithm is based on Lemma 13.
Given a 2½-player game $G = ((S, E), (S_1, S_2, S_{\bigcirc}), \delta)$ with a Muller objective $\Phi$, a state $s$ and a
rational number $r$, the following assertion holds: if $\langle\!\langle 1 \rangle\!\rangle_{\mathit{val}}(\Phi)(s) \geq r$, then there exists a partition
$\mathcal{V} = (V_0, V_1, V_2, \ldots, V_k)$ of $S$ and rational values $r_0 > r_1 > r_2 > \cdots > r_k$, such that $r_i = \frac{p_i}{q_i}$ with
$p_i, q_i \leq \delta_u^{4 \cdot |E|}$, such that the conditions of Lemma 13 are satisfied, and $s \in V_i$ with $r_i \geq r$. The witness
$\mathcal{V}$ is the value class partition and the rational values represent the values of the value classes. From
the above observation we obtain the algorithm for quantitative analysis as follows: given a 2½-player
game graph $G = ((S, E), (S_1, S_2, S_{\bigcirc}), \delta)$ with a Muller objective $\Phi$, a state $s$ and a rational
$r$, to verify that $\langle\!\langle 1 \rangle\!\rangle_{\mathit{val}}(\Phi)(s) \geq r$, the algorithm guesses a partition $\mathcal{V} = (V_0, V_1, V_2, \ldots, V_k)$ of $S$
and rational values $r_0 > r_1 > r_2 > \cdots > r_k$ such that $r_i = \frac{p_i}{q_i}$ with $p_i, q_i \leq \delta_u^{4 \cdot |E|}$, and then verifies
that all the conditions of Lemma 13 are satisfied, and $s \in V_i$ with $r_i \geq r$. Observe that since the
guesses of the rational values can be made with $O(|G| \cdot |S| \cdot |E|)$ bits, the guess is polynomial in the size
of the game. Condition 1 and condition 5 of Lemma 13 can be verified in PSPACE by the
PSPACE qualitative algorithms (see Theorem 5), and all the other conditions can be checked in
polynomial time. Since NPSPACE = PSPACE we obtain a PSPACE upper bound for quantitative
analysis of 2½-player games with Muller objectives.
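
To make the polynomial-time part of the verification concrete, the following Python sketch checks conditions 2, 4 and 6 of Lemma 13 (together with the strict ordering of the guessed values) for an explicitly given witness. It is an illustration under assumptions of our own: the encoding of the game and the name `check_numeric_conditions` are hypothetical, and conditions 1, 3 and 5, which require qualitative game solving, are deliberately left out.

```python
from fractions import Fraction

def check_numeric_conditions(classes, values, owner, succ, delta):
    """Check conditions 2, 4 and 6 of Lemma 13 for a guessed witness.

    classes: list [V_0, ..., V_k] of disjoint sets of states;
    values:  list [r_0, ..., r_k] of Fractions;
    owner[s] is 1, 2 or "prob"; succ[s] is the successor set E(s);
    delta[s][t] is the transition probability at probabilistic states.
    """
    k = len(classes) - 1
    index = {s: i for i, cls in enumerate(classes) for s in cls}
    x = {s: values[i] for s, i in index.items()}
    # Condition 2 and the strict ordering r_0 > r_1 > ... > r_k.
    if values[0] != 1 or values[k] != 0:
        return False
    if any(values[i] <= values[i + 1] for i in range(k)):
        return False
    for s, i in index.items():
        # Condition 4: player-2 states of V_i (1 <= i <= k-1) only move to V_j with j <= i.
        if owner[s] == 2 and 1 <= i <= k - 1:
            if any(index[t] > i for t in succ[s]):
                return False
        # Condition 6: x_s is the delta-weighted average of the successor values.
        if owner[s] == "prob":
            if x[s] != sum(delta[s][t] * x[t] for t in succ[s]):
                return False
    return True

# Toy witness (hypothetical): V_0 = {w1}, V_1 = {p, q}, V_2 = {w2}.
classes = [{"w1"}, {"p", "q"}, {"w2"}]
owner = {"w1": 1, "p": "prob", "q": 2, "w2": 1}
succ = {"w1": {"w1"}, "p": {"w1", "w2"}, "q": {"p"}, "w2": {"w2"}}
delta = {"p": {"w1": Fraction(1, 2), "w2": Fraction(1, 2)}}
good = check_numeric_conditions(
    classes, [Fraction(1), Fraction(1, 2), Fraction(0)], owner, succ, delta)
bad = check_numeric_conditions(
    classes, [Fraction(1), Fraction(1, 3), Fraction(0)], owner, succ, delta)
```

Exact rational arithmetic is used so that the equality in condition 6 is tested without rounding error.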

Theorem 7 Given a 2½-player game $G$, a Muller objective $\Phi$, a state $s$, and a rational $r$ in
binary, it is PSPACE-complete to decide if $\langle\!\langle 1 \rangle\!\rangle_{\mathit{val}}(\Phi)(s) \geq r$.

4.2  The  complexity  of  union-closed  and  upward-closed  Muller  objectives 

We now consider two special classes of Muller objectives: namely, union-closed and upward-closed
objectives. We will show that the quantitative analysis of both these classes of objectives in 2½-player
games under succinct representation is coNP-complete. We first present these conditions.

1. Union-closed and basis conditions. A Muller winning condition $\mathcal{F} \subseteq \mathcal{P}(C)$ is union-closed if
for all $I, J \in \mathcal{F}$ we have $I \cup J \in \mathcal{F}$. A basis condition $\mathcal{B} \subseteq \mathcal{P}(C)$, given as a set $\mathcal{B}$, specifies the
winning condition $\mathcal{F} = \{ I \subseteq C \mid \exists B_1, B_2, \ldots, B_k \in \mathcal{B}.\ \bigcup_{1 \leq j \leq k} B_j = I \}$. A Muller winning
condition $\mathcal{F}$ can be specified as a basis condition only if $\mathcal{F}$ is union-closed.

2. Upward-closed and superset conditions. A Muller winning condition $\mathcal{F} \subseteq \mathcal{P}(C)$ is upward-closed
if for all $I \in \mathcal{F}$ and $I \subseteq J \subseteq C$ we have $J \in \mathcal{F}$. A superset condition $\mathcal{U} \subseteq \mathcal{P}(C)$ specifies
the winning condition $\mathcal{F} = \{ I \subseteq C \mid J \subseteq I \text{ for some } J \in \mathcal{U} \}$. A Muller winning condition
$\mathcal{F}$ can be specified as a superset condition only if $\mathcal{F}$ is upward-closed. Any upward-closed
condition is also union-closed.
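
The two succinct representations can be read as membership tests. The sketch below is illustrative only: the function names are hypothetical, and colors are represented as Python `frozenset`s, which is an assumption of the example. A set of colors belongs to the condition generated by a basis exactly when it equals the union of the basis elements it contains (and contains at least one), and it belongs to a superset condition when it contains some generator.

```python
def in_basis_condition(colors, basis):
    """True iff `colors` is a union of one or more elements of `basis`."""
    colors = frozenset(colors)
    inside = [b for b in basis if b <= colors]
    # The union of all basis elements inside `colors` is the largest union
    # that can be formed without leaving `colors`.
    return bool(inside) and frozenset().union(*inside) == colors

def in_superset_condition(colors, generators):
    """True iff `colors` contains some element of `generators`."""
    colors = frozenset(colors)
    return any(j <= colors for j in generators)

# Hypothetical example over the colors a, b, c.
basis = [frozenset("ab"), frozenset("bc")]
generators = [frozenset("a")]
examples = {
    "abc in basis condition": in_basis_condition("abc", basis),
    "b in basis condition": in_basis_condition("b", basis),
    "ab in superset condition": in_superset_condition("ab", generators),
    "bc in superset condition": in_superset_condition("bc", generators),
}
```

Both tests run in time polynomial in the size of the succinct representation, without enumerating the explicit family.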

The results of [15] showed that the basis and superset conditions are more succinct ways to
represent union-closed and upward-closed conditions, respectively, than the explicit representation.
The following proposition was also shown in [15] (see [15] for the formal description of the notion
of succinctness and translatability).

Proposition 2 ([15]) A superset condition is polynomially translatable to an equivalent basis
condition.

Strategy complexity for union-closed conditions. We observe that for a union-closed
objective $\mathcal{F}$, the Zielonka tree construction ensures that $m_{\mathcal{F}} = 1$. Then from Theorem 6 we obtain
that for objectives $\mathrm{Muller}(\mathcal{F})$ pure memoryless optimal strategies exist in 2½-player game graphs,
for union-closed conditions $\mathcal{F}$.

Proposition 3 For all union-closed winning conditions $\mathcal{F}$ we have $m_{\mathcal{F}} = 1$; and pure memoryless
optimal strategies exist for objective $\mathrm{Muller}(\mathcal{F})$ for all 2½-player game graphs.

Complexity of basis and superset conditions. The results of [15] established that deciding the
winner in 2-player games (that is, qualitative analysis for 2-player game graphs) with union-closed
and upward-closed conditions specified as basis and superset conditions is coNP-complete. The
lower bound for the special case of 2-player games yields a coNP lower bound for the quantitative
analysis of 2½-player games with union-closed and upward-closed conditions specified as basis and
superset conditions. We will prove a matching upper bound. We prove the upper bound for basis
conditions, and by Proposition 2 the result also follows for superset conditions.

The upper bound for basis games. We present a coNP upper bound for the quantitative
analysis for basis games. Given a 2½-player game graph and a Muller objective $\Phi = \mathrm{Muller}(\mathcal{F})$,
where $\mathcal{F}$ is union-closed and specified as a basis condition defined by $\mathcal{B}$, let $s$ be a state and $r$
be a rational given in binary. The problem whether $\langle\!\langle 1 \rangle\!\rangle_{\mathit{val}}(\Phi)(s) \geq r$ can be decided in coNP.
We present a polynomial witness and a polynomial time verification procedure when the answer to
the problem is "NO". Since $\mathcal{F}$ is union-closed, it follows from Proposition 3 that a pure memoryless
optimal strategy $\pi$ exists for player 2. The pure memoryless optimal strategy is the polynomial
witness to the problem, and once $\pi$ is fixed we obtain a 1½-player game graph $G_\pi$. To present a
polynomial time verification procedure we present a polynomial time algorithm to compute values
in an MDP (or 1½-player game) with basis condition $\mathcal{B}$.

Preliminaries on MDPs. We develop some facts on end components [8, 9] that will be useful
tools for the analysis of MDPs.

Definition 6 (End component) A set $U \subseteq S$ of states is an end component if $U$ is $\delta$-closed and
the subgame graph $G \upharpoonright U$ is strongly connected.
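
Definition 6 can be checked directly on an explicit graph. The sketch below is an illustration with an encoding of our own (`succ` maps a state to its successor set, `prob` is the set of probabilistic states, and `is_end_component` is a hypothetical name). It reads "$\delta$-closed" as: every probabilistic state of $U$ has all its successors in $U$; it additionally requires every state of $U$ to keep at least one successor in $U$, so that the restriction to $U$ is again a game graph. Both readings are assumptions of the example.

```python
def is_end_component(U, succ, prob):
    """Check Definition 6 for a candidate set U on an explicit graph."""
    U = set(U)
    if not U:
        return False
    for s in U:
        inside = succ[s] & U
        # Probabilistic states must keep all successors inside U, and every
        # state must keep at least one successor inside U.
        if not inside or (s in prob and inside != succ[s]):
            return False

    def reach(start, step):
        seen, stack = {start}, [start]
        while stack:
            u = stack.pop()
            for w in step[u]:
                if w in U and w not in seen:
                    seen.add(w)
                    stack.append(w)
        return seen

    # Strong connectivity: one root reaches all of U forwards and backwards.
    pred = {s: {u for u in U if s in succ[u]} for s in U}
    root = next(iter(U))
    return reach(root, succ) == U and reach(root, pred) == U

# Hypothetical graph: a <-> b, a -> c, c -> c.
succ = {"a": {"b", "c"}, "b": {"a"}, "c": {"c"}}
player_a = is_end_component({"a", "b"}, succ, prob={"b"})
random_a = is_end_component({"a", "b"}, succ, prob={"a"})
```

When `a` is a player state the set {a, b} is an end component, since the player can avoid `c`; when `a` is probabilistic the edge to `c` leaves the set and the closure requirement fails.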

We denote by $\mathcal{E} \subseteq 2^S$ the set of all end components of $G$. The next lemma states that, under
any strategy (memoryless or not), with probability 1 the set of states visited infinitely often along
a play is an end component. This lemma allows us to derive conclusions on the (infinite) set of
plays in an MDP by analyzing the (finite) set of end components in the MDP.

Lemma 14 [8, 9] For all states $s \in S$ and strategies $\sigma \in \Sigma$, we have $\Pr_s^{\sigma}(\mathrm{Muller}(\mathcal{E})) = 1$.

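Given a Muller condition $\mathcal{F}$, we denote by $\mathcal{U} = \mathcal{E} \cap \{ F \subseteq S \mid \chi^{-1}(F) \in \mathcal{F} \}$ the set of end
components that are Muller sets. These are the winning end components. Let $T_{\mathit{end}} = \bigcup_{U \in \mathcal{U}} U$
be their union. From Lemma 14 and Theorem 4 of [2], it follows that the maximal probability of
satisfying the objective $\mathrm{Muller}(\mathcal{F})$ is equal to the maximal probability of reaching the union of the
winning end components.

Lemma 15 For all 1½-player games and for all Muller objectives $\mathrm{Muller}(\mathcal{F})$ we have
$\langle\!\langle 1 \rangle\!\rangle_{\mathit{val}}(\mathrm{Muller}(\mathcal{F})) = \langle\!\langle 1 \rangle\!\rangle_{\mathit{val}}(\mathrm{Reach}(T_{\mathit{end}}))$.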

Maximal end components. An end component $U \subseteq S$ is maximal in $V \subseteq S$ if $U \subseteq V$, and if
there is no end component $U'$ with $U \subsetneq U' \subseteq V$. Given a set $V \subseteq S$, we denote by $\mathrm{MaxEC}(V)$ the
set consisting of all maximal end components $U$ such that $U \subseteq V$.

Polynomial time algorithm for MDPs with basis condition. Given a 1½-player game
graph $G$, let $\mathcal{E}$ be the set of end components. Consider a basis condition $\mathcal{B} = \{ B_1, B_2, \ldots, B_k \} \subseteq
\mathcal{P}(C)$, and let $\mathcal{F}$ be the union-closed condition generated from $\mathcal{B}$. The set of winning end
components is $\mathcal{U} = \mathcal{E} \cap \{ F \subseteq S \mid \chi^{-1}(F) \in \mathcal{F} \}$, and let $T_{\mathit{end}} = \bigcup_{U \in \mathcal{U}} U$. It follows from
Lemma 15 that the value function in $G$ can be computed by computing the maximal probability
to reach $T_{\mathit{end}}$. Once the set $T_{\mathit{end}}$ is computed, the value function for the reachability objective in
1½-player game graphs can be computed in polynomial time by linear programming (see [13]). To
complete the proof we present a polynomial time algorithm to compute $T_{\mathit{end}}$.

Computing winning end components. The algorithm is as follows. Let $\mathcal{B}$ be the basis for the
winning condition and $G$ be the 1½-player game graph. Initialize $\mathcal{B}_0 = \mathcal{B}$ and repeat the following:

1. let $X_i = \bigcup_{B \in \mathcal{B}_i} \chi^{-1}(B)$;

2. partition the set $X_i$ into maximal end components $\mathrm{MaxEC}(X_i)$;

3. remove an element $B$ of $\mathcal{B}_i$ such that $\chi^{-1}(B)$ is not wholly contained in a maximal end
component to obtain $\mathcal{B}_{i+1}$;

until $\mathcal{B}_i = \mathcal{B}_{i-1}$. When $\mathcal{B}_i = \mathcal{B}_{i-1}$, let $X = X_i$, and every maximal end component of $X$ is a union
of basis elements (all states in $X$ have colors that belong to basis elements, i.e., $\chi(X) \subseteq \bigcup_{B \in \mathcal{B}_i} B$, and a basis element
not contained in any maximal end component of $X$ is removed in step 3). Moreover, any maximal
end component of $G$ which is a union of basis elements is a subset of a maximal end component
of $X$, since the algorithm preserves such sets. Hence we have $X = T_{\mathit{end}}$. The algorithm requires
$|\mathcal{B}|$ iterations and each iteration requires the decomposition of a 1½-player game graph into the
set of maximal end components, which can be achieved in $O(|S| \cdot |E|)$ time (see [9]). Hence the
algorithm works in $O(|\mathcal{B}| \cdot |S| \cdot |E|)$ time. This completes the proof and yields the following result.
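
The computation of $T_{\mathit{end}}$ can be sketched in Python. The code below is not the procedure stated above but a closely related variant chosen because its correctness is easy to argue: it decomposes the current state set into maximal end components, keeps a component whose color set is exactly a union of basis elements, and otherwise recurses on the states whose colors are covered by basis elements inside that component. All names and the graph encoding (`succ`, `prob`, `color`) are assumptions of the example, the end components are computed by a simple quadratic method rather than the algorithm of [9], and $\chi$ is read as a map from states to colors.

```python
def _reach(start, nodes, step):
    seen, stack = {start}, [start]
    while stack:
        u = stack.pop()
        for w in step[u]:
            if w in nodes and w not in seen:
                seen.add(w)
                stack.append(w)
    return seen

def _sccs(nodes, succ):
    """Strongly connected components of the graph restricted to `nodes`."""
    nodes, comps = set(nodes), []
    pred = {u: set() for u in nodes}
    for u in nodes:
        for w in succ[u]:
            if w in nodes:
                pred[w].add(u)
    while nodes:
        v = next(iter(nodes))
        comp = _reach(v, nodes, succ) & _reach(v, nodes, pred)
        comps.append(comp)
        nodes -= comp
    return comps

def max_end_components(states, succ, prob):
    """Maximal end components contained in `states`."""
    result, work = [], [set(states)]
    while work:
        cand = work.pop()
        changed = True
        while changed:  # prune states that cannot stay inside cand
            changed = False
            for s in list(cand):
                if not (succ[s] & cand) or (s in prob and not succ[s] <= cand):
                    cand.discard(s)
                    changed = True
        if not cand:
            continue
        comps = _sccs(cand, succ)
        if len(comps) == 1:
            result.append(frozenset(cand))
        else:
            work.extend(comps)
    return result

def winning_end_component_states(states, succ, prob, color, basis):
    """Union of the end components whose color set is a union of basis elements."""
    t_end, work = set(), [set(states)]
    while work:
        for mec in max_end_components(work.pop(), succ, prob):
            cols = {color[s] for s in mec}
            covered = set().union(*(b for b in basis if b <= cols))
            keep = {s for s in mec if color[s] in covered}
            if keep == mec:
                t_end |= mec      # color set is exactly a union of basis elements
            elif keep:
                work.append(keep)  # a winning end component can only live in `keep`
    return t_end

# Hypothetical MDP: cycle a -> b -> c -> d -> a with an extra edge b -> a.
succ = {"a": {"b"}, "b": {"a", "c"}, "c": {"d"}, "d": {"c", "a"}}
color = {"a": 1, "b": 2, "c": 3, "d": 3}
basis = [frozenset({1, 2})]
t_player = winning_end_component_states(succ.keys(), succ, set(), color, basis)
t_random = winning_end_component_states(succ.keys(), succ, {"b"}, color, basis)
```

With `b` controlled by the player, the sub-component {a, b} has color set {1, 2} and is winning; with `b` probabilistic the edge to `c` cannot be avoided, so no winning end component exists.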

Theorem 8 Given a 2½-player game graph and a Muller objective $\Phi = \mathrm{Muller}(\mathcal{F})$, where $\mathcal{F}$ is a
union-closed condition specified as a basis condition defined by $\mathcal{B}$ or $\mathcal{F}$ is an upward-closed condition
specified as a superset condition $\mathcal{U}$, a state $s$ and a rational $r$ given in binary, it is coNP-complete
to decide whether $\langle\!\langle 1 \rangle\!\rangle_{\mathit{val}}(\Phi)(s) \geq r$.

5  An  Improved  Bound  for  Randomized  Strategies 

We now show that if a player plays randomized strategies, then the upper bound on memory for
optimal strategies can be improved. We first present the notion of an upward closed restriction of
a Zielonka tree. The number $m^{U}_{\mathcal{F}}$ obtained from such a restriction of the Zielonka tree will be in general lower
than the number $m_{\mathcal{F}}$ obtained from the Zielonka tree, and we show that randomized strategies with memory of
size $m^{U}_{\mathcal{F}}$ suffice for optimality.

Upward closed sets. A set $\mathcal{F} \subseteq \mathcal{P}(C)$ is upward closed if for all $F \in \mathcal{F}$ and all $F_1 \subseteq C$ with $F \subseteq F_1$ we have
$F_1 \in \mathcal{F}$, i.e., if a set $F$ is in $\mathcal{F}$, then all supersets $F_1$ of $F$ are in $\mathcal{F}$ as well.
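
For an explicitly listed family the definition can be tested mechanically. The following Python sketch is an illustration with a hypothetical function name; it uses the observation that it suffices to check supersets obtained by adding a single color, since larger supersets then follow by induction.

```python
def is_upward_closed(family, colors):
    """True iff every superset (within `colors`) of a member of `family` is a member."""
    family = {frozenset(f) for f in family}
    colors = frozenset(colors)
    # Adding one color at a time reaches every superset of a member.
    return all(f | {c} in family for f in family for c in colors - f)

# Hypothetical examples over the colors 1, 2, 3.
C = {1, 2, 3}
only_full_set = is_upward_closed([{1, 2, 3}], C)
all_with_one = is_upward_closed([{1}, {1, 2}, {1, 3}, {1, 2, 3}], C)
with_gap = is_upward_closed([{1}, {1, 2, 3}], C)
```

The family consisting only of the full color set is trivially upward closed, because the full set has no proper superset within $C$.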

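Upward closed restriction of Zielonka tree. The upward closed restriction of a Zielonka tree
for a Muller winning condition $\mathcal{F} \subseteq \mathcal{P}(C)$, denoted as $\mathcal{Z}^{U}_{\mathcal{F},C}$, is obtained by making upward closed
conditions as leaves. Formally, we define $\mathcal{Z}^{U}_{\mathcal{F},C}$ inductively as follows:

1. if $\mathcal{F}$ is upward closed, then $\mathcal{Z}^{U}_{\mathcal{F},C}$ is a leaf labeled $\mathcal{F}$ (i.e., it has no subtrees);

2. otherwise

(a) if $C \notin \mathcal{F}$, then $\mathcal{Z}^{U}_{\mathcal{F},C} = \mathcal{Z}^{U}_{\overline{\mathcal{F}},C}$, where $\overline{\mathcal{F}} = \mathcal{P}(C) \setminus \mathcal{F}$.

(b) if $C \in \mathcal{F}$, then the root of $\mathcal{Z}^{U}_{\mathcal{F},C}$ is labeled with $C$; and let $C_0, C_1, \ldots, C_{k-1}$ be all the
maximal sets in $\{ X \notin \mathcal{F} \mid X \subseteq C \}$; then we attach to the root, as its subtrees, the
Zielonka upward closed restricted trees of $\mathcal{F} \upharpoonright C_i$, i.e., $\mathcal{Z}^{U}_{\mathcal{F} \upharpoonright C_i, C_i}$, for $i = 0, 1, \ldots, k - 1$.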

The number $m^{U}_{\mathcal{F}}$ for $\mathcal{Z}^{U}_{\mathcal{F},C}$ is defined in the same way as the number $m_{\mathcal{F}}$ was defined for the tree $\mathcal{Z}_{\mathcal{F},C}$.
We will prove that randomized strategies of size $m^{U}_{\mathcal{F}}$ suffice for optimality. To prove this result,
we first prove that randomized strategies of size $m^{U}_{\mathcal{F}}$ suffice for almost-sure winning. The result
then follows from Lemma 6. To prove the result for almost-sure winning we take a closer look
at the proof of Theorem 3. The inductive proof shows that if the existence of randomized
memoryless strategies can be proved for 2½-player games with Muller winning conditions that
appear in the leaves of the Zielonka tree, then the inductive proof generalizes to give a bound as in
Theorem 3. Hence to prove an upper bound of size $m^{U}_{\mathcal{F}}$ for almost-sure winning, it suffices to show
that randomized memoryless strategies suffice for upward closed Muller winning conditions. In [3]
it was shown that for all 2½-player games randomized memoryless strategies suffice for almost-sure
winning for upward closed objectives (see Appendix for a proof). This gives us Theorem 9.

Theorem 9 For all Muller winning conditions $\mathcal{F}$, the family of randomized finite-memory strategies
of size $m^{U}_{\mathcal{F}}$ suffices for optimality on 2½-player game graphs.

Remark. In general we have $m^{U}_{\mathcal{F}} \leq m_{\mathcal{F}}$. Consider for example $\mathcal{F} \subseteq \mathcal{P}(C)$, where $C =
\{ c_1, c_2, \ldots, c_k \}$. For the Muller winning condition $\mathcal{F} = \{ C \}$, we have $m^{U}_{\mathcal{F}} = 1$, and $m_{\mathcal{F}} = |C|$.

6 Conclusion


In this work we present optimal memory bounds for pure almost-sure, positive and optimal
strategies for 2½-player games with Muller winning conditions. We also present improved memory
bounds for randomized strategies. Unlike the results of [10] our results do not extend to infinite
state games: for example, the results of [12] showed that even for 2½-player pushdown games
optimal strategies need not exist, and for $\varepsilon > 0$ even $\varepsilon$-optimal strategies may require infinite
memory. For lower bounds for randomized strategies the constructions of [10] do not work: in fact
for the family of games used for lower bounds in [10] randomized memoryless almost-sure winning
strategies exist. However, it is known that there exist Muller winning conditions $\mathcal{F} \subseteq \mathcal{P}(C)$ such
that randomized almost-sure winning strategies may require memory $|C|!$ [16]. Whether a
matching lower bound of size $m^{U}_{\mathcal{F}}$ can be proved in general, or whether the upper bound of $m^{U}_{\mathcal{F}}$ can
be improved and a matching lower bound can be proved for randomized strategies with memory,
remains open.

Appendix


Theorem 10 ([3]) The family of randomized memoryless strategies suffices for almost-sure winning
with respect to upward closed objectives on 2½-player game graphs.

Proof. Consider a 2½-player game graph $G$ and the game $(G, C, \chi, \mathcal{F})$ with an upward closed
objective $\Phi = \mathrm{Muller}(\mathcal{F})$ for player 1, i.e., $\mathcal{F}$ is upward closed. Let $W_1 = \langle\!\langle 1 \rangle\!\rangle_{\mathit{almost}}(\Phi)$ be the
set of almost-sure winning states for player 1 in $G$. We have $S \setminus W_1 = \langle\!\langle 2 \rangle\!\rangle_{\mathit{pos}}(\overline{\Phi})$ and hence any
almost-sure winning strategy for player 1 ensures that from $W_1$ the set $S \setminus W_1$ is not reached with
positive probability. Hence we only need to consider strategies $\sigma$ for player 1 such that for all
$w \in W_1^*$ and $s \in W_1$ we have $\mathrm{Supp}(\sigma(w \cdot s)) \subseteq W_1$. Consider a randomized memoryless strategy
$\sigma$ for player 1 such that at a state $s \in W_1$ it chooses uniformly at random among all successors in $W_1$.
Observe that for a state $s \in (S_2 \cup S_{\bigcirc}) \cap W_1$ we have $E(s) \subseteq W_1$; otherwise $s$ would not have
been in $W_1$. Consider the MDP $G_\sigma \upharpoonright W_1$. Since it is a player-2 MDP with the Muller objective
$\overline{\Phi}$ and randomized memoryless optimal strategies exist in MDPs [3], we fix a memoryless counter-optimal
strategy $\pi$ for player 2 in $G_\sigma \upharpoonright W_1$. Now consider the player-1 MDP $G_\pi \upharpoonright W_1$. Consider a
memoryless strategy $\sigma'$ in $G_\pi \upharpoonright W_1$. We first present an observation: since the strategy $\sigma$ chooses
all successors in $W_1$ uniformly at random and for all $s \in W_1 \cap S_1$ we have $\mathrm{Supp}(\sigma'(s)) \subseteq \mathrm{Supp}(\sigma(s))$,
it follows that for every closed recurrent set $U'$ in the Markov chain $G_{\sigma',\pi} \upharpoonright W_1$ there is a closed
recurrent set $U$ in the Markov chain $G_{\sigma,\pi} \upharpoonright W_1$ with $U' \subseteq U$. We now prove that $\sigma$ is an almost-sure
winning strategy by showing that every closed recurrent set of states $U$ in $G_{\sigma,\pi} \upharpoonright W_1$ is winning for player 1,
i.e., $\chi(U) \in \mathcal{F}$. Assume towards contradiction that there is a closed recurrent set $U$ in $G_{\sigma,\pi} \upharpoonright W_1$ with
$\chi(U) \notin \mathcal{F}$. Consider the player-1 MDP $G_\pi \upharpoonright W_1$. Since randomized memoryless optimal strategies
exist in MDPs [3], we fix a memoryless counter-optimal strategy $\sigma'$ for player 1. By the observation,
for any closed recurrent set $U'$ in $G_{\sigma',\pi}$ such that $U' \cap U \neq \emptyset$ we have $U' \subseteq U$; and moreover,
$\chi(U') \subseteq \chi(U)$ and $\chi(U') \notin \mathcal{F}$, since $\mathcal{F}$ is upward closed and $\chi(U) \notin \mathcal{F}$. It then follows that player 2
wins with probability 1 from a non-empty set $U'$ (a closed recurrent set $U' \subseteq U$) of states in the
Markov chain $G_{\sigma',\pi}$. Since $\pi$ is a fixed strategy for player 2 and the strategy $\sigma'$ is counter-optimal
for player 1, this contradicts that $U' \subseteq U \subseteq \langle\!\langle 1 \rangle\!\rangle_{\mathit{almost}}(\Phi)$. It follows that every closed recurrent set
$U$ in $G_{\sigma,\pi} \upharpoonright W_1$ is winning for player 1 and the result follows. ∎